
I have been asked to do a power calculation for a pre-post observational study. The researchers want me to calculate the sample size needed to detect a clinically meaningful number of clients attending a pain clinic who report a 1-point or greater reduction in pain rating on a standard 0-10 scale.

If it were just a matter of calculating the sample size required to detect a 1-point average reduction on the pain scale, that would be fine; I know how to do that. But what they want is the sample size required to estimate the proportion of clients who have a 1-point or greater reduction from the start of treatment to four weeks after the start of treatment.
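
For reference, here is a minimal sketch of the calculation I do know how to do: power for detecting a 1-point mean reduction in a paired pre-post design, assuming (purely for illustration) that the change scores have a standard deviation of about 2 points:

# Sketch only: the SD of the pre-post change scores (2 points) is an assumed
# value for illustration, not something from the study
power.t.test(delta = 1,        # 1-point mean reduction
             sd = 2,           # assumed SD of the change scores
             sig.level = 0.05,
             power = 0.80,
             type = "paired")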

I am struggling to conceptualise how to do this. Usually an odds ratio of 2 is considered the minimum effect size for a difference in the proportion of 'cases' or 'target events' between groups. I was thinking a similar sort of reduction might be the threshold minimum for this study, a single-group, pre-post design, as well. But I don't really know how to do it, since (a) there is no group comparison, just a single number, and (b) the proportion of 'cases' at the start of treatment is 100%, making the odds of the 'target event' infinite.

As you can see I'm very confused. Any help appreciated.

Edit

From the conversations below and a bit of thinking, I have concluded that what my colleagues want me to do is calculate the sample size for estimating the population proportion of people attending a pain clinic who have a reduction of one point or more between the start of treatment and four weeks into treatment, within a given margin of error.

I spent yesterday trying to figure out how to do this and have posted some R code below.

# Define parameters
confidence_level <- 0.95     # desired confidence level
margin_of_error <- 0.05      # desired margin of error (e.g. 5%)
estimated_proportion <- 0.5  # estimated population proportion; if unknown,
                             # p = 0.5 maximizes the required sample size,
                             # giving a conservative estimate

# Calculate the Z-score
# For a two-sided confidence interval, use qnorm(1 - alpha/2)
alpha <- 1 - confidence_level    # calculated from the confidence level
z_score <- qnorm(1 - alpha/2)    # Z-score for the specified confidence level;
                                 # for 95%, alpha is 0.05 and qnorm(0.975)
                                 # returns 1.96

# Calculate the sample size: n = z^2 * p * (1 - p) / E^2
sample_size <- (z_score^2 * estimated_proportion *
    (1 - estimated_proportion)) / (margin_of_error^2)

# Round up to the nearest whole number, as the sample size must be an integer
sample_size <- ceiling(sample_size)

# Print the result
print(paste("Required sample size:", sample_size))

# Output
[1] "Required sample size: 385"

So I get a sample size of 385. I'm interested in whether people think my statistical approach is valid. Obviously there are deep flaws in the methodology of the study (i.e. no control group), but that is not my choice.

  • This statement, "a meaningful reduction in proportion of clients who have 1-point or more reduction from start of treatment to four weeks from start of treatment", does not make sense. You could look at the proportion who have a 1-point or more change, but you can't look at a reduction in this number. You don't have the right data. Commented Nov 9 at 1:17
  • Good point @Peter Flom, yet more evidence of my confusion. So I guess they want me to estimate the proportion who have a clinically meaningful reduction. In that case, what need is there to even do a sample size calculation? To calculate a margin of error? Commented Nov 9 at 4:49
  • I think I've sorted out in my head what I need. I need to calculate the sample size required to estimate the proportion of people who enter a pain clinic who experience a 1-point or more reduction in pain score from the start of treatment to four weeks into treatment, with a (say) 5% margin of error. Does that sound more like a well-framed question? I am not used to this sort of power analysis at all, having mainly done sample size calculations for case-control studies and RCTs. Commented Nov 9 at 5:45
  • You have to ask them what they want. Commented Nov 9 at 11:28
  • You really don't want to do this without a proper control (which, as I understand it, is how the study is now planned). Not only is your outcome quite subjective, but if you start with everyone above some threshold, then by design you expect a decrease on average even without any meaningful intervention. This is really one of those cases where, if you are not able to do it properly, you'd better not do it at all: the outcome will not tell you what you think it will tell you. Commented Nov 9 at 20:31

2 Answers


Your code looks good in general; it follows the standard approach for a confidence interval for a proportion.

I like to use simulations to confirm (and sometimes find) sample size estimates. Here is some possible R code:

n <- 385      # candidate sample size
p <- 0.5      # assumed true proportion

B <- 10000    # number of simulated studies

# Simulate B binomial counts and compute the exact (Clopper-Pearson)
# confidence interval for each simulated study
x <- rbinom(B, n, p)

out <- sapply(x, function(xx) binom.test(xx, n)$conf.int)

# Half-width of each interval, i.e. the realised margin of error
me <- (out[2,] - out[1,])/2
mean(me)
median(me)
hist(me)

# or, using the Wilson score interval (with continuity correction) from prop.test()

out <- sapply(x, function(xx) prop.test(xx, n)$conf.int)

me <- (out[2,] - out[1,])/2
mean(me)
median(me)
hist(me)

When I run this, the mean and median of the margins of error are slightly higher than 0.05, but more importantly, the distribution of the margins of error is highly skewed, which you may want to take into account. Change the n <- line to different values and rerun until you find a sample size with properties that you are happy with.
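
To make that trial-and-error step easier, here is a small follow-up sketch (my addition, not part of the original code above): it wraps the simulation in a helper function so several candidate sample sizes can be compared in one pass. The function name, the candidate values, and the "proportion of simulated studies with margin of error at or below 0.05" summary are my own choices.

# Sketch only: compare several candidate sample sizes in one pass
sim_me <- function(n, p = 0.5, B = 10000, target_me = 0.05) {
  x <- rbinom(B, n, p)                            # simulated counts
  out <- sapply(x, function(xx) binom.test(xx, n)$conf.int)
  me <- (out[2, ] - out[1, ]) / 2                 # realised margins of error
  c(n = n, mean_me = mean(me), median_me = median(me),
    prop_within_target = mean(me <= target_me))
}

set.seed(123)
t(sapply(c(385, 400, 425, 450), sim_me))          # candidate n values are assumptions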

  • Thanks @Greg Snow, much appreciated. I also use simulation for power analysis, but all my experience so far has been with group comparisons and repeated measures/mixed effects models. This was all new to me. I think I'm on the right track now. Commented Nov 10 at 19:51

The prop.test() function in R is also useful for this problem, as it gives the 95% confidence interval around a proportion observed in a particular sample size. Here is some code to plot the 95% confidence interval around a proportion of 0.5 (which, as you point out, gives the widest bound) for sample sizes 1 through 1000. Using this function, we find that the confidence interval shrinks to within +/-5% at N = 381.

nRange <- 1000
confInt <- c()

# 95% confidence interval for an observed proportion of 0.5 at each sample size
for (i in 1:nRange)
{
  confInt <- rbind(confInt, prop.test(i*0.5, i)$conf.int)
}

# Plot the lower and upper confidence bounds against N
plot(1:nRange, confInt[,1], type = "l", ylim = c(0,1), xlab = "N", ylab = "Confidence Bounds")
par(new = TRUE)
plot(1:nRange, confInt[,2], type = "l", ylim = c(0,1), xaxt = "n", yaxt = "n", xlab = "", ylab = "")
abline(h = 0.5, lty = 2, col = "red")    # the assumed proportion
abline(h = c(0.55, 0.45), lty = 2)       # +/- 5% around it

# Check the interval at N = 381
print(prop.test(190.5, 381))

[Figure: lower and upper 95% confidence bounds around a proportion of 0.5 plotted against N (1 to 1000), with dashed reference lines at 0.45, 0.5, and 0.55]
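
As a small follow-up sketch (my addition, not part of the answer's code), the same cut-off can be located programmatically by finding the first sample size at which the half-width of the prop.test() interval drops to 0.05 or below. Restricting the search to even N, so that the observed count N/2 is a whole number, is my own choice:

# Sketch only: smallest even N whose interval half-width is <= 0.05
ns <- seq(2, 1000, by = 2)
hw <- sapply(ns, function(n) diff(prop.test(n / 2, n)$conf.int) / 2)
ns[min(which(hw <= 0.05))]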

  • Brilliant. Thank you. Commented Nov 14 at 19:11
