Calculating error bounds for standard deviation given a limited sample size

Question

Suppose I measure 100 samples of a normal distribution and use them to compute a standard deviation.

Is there a way to compute +/- error bounds on my computed standard deviation, so that I know with 95% confidence what the true standard deviation would be had I measured 1 million samples instead of the original 100?

Practical application: I characterize 100 units with the intent to create max and min specifications for my product's standard deviation. A customer buys 1 million units and wants to know, with 95% confidence, what max and min values for standard deviation we guarantee. How can I create a specification for my product's datasheet that satisfies the customer's interest when I measure fewer units than the customer buys?

You may be asking this: We have i.i.d. $\{X_i\}$ that are $N(m,\sigma^2)$ with unknown $m, \sigma^2$. We define the sample mean $M_n=\frac{1}{n}\sum_{i=1}^nX_i$ and sample variance $V_n=\frac{1}{n-1}\sum_{i=1}^n(X_i-M_n)^2$. If so, you can find confidence intervals for the true $\sigma^2$ based on $V_n$, using the fact that $\frac{(n-1)V_n}{\sigma^2}$ has a chi-square distribution with $n-1$ degrees of freedom. This is a standard statistics calculation.
– Michael, Commented Jun 30 at 23:13

Pavan C. · Accepted Answer · 2025-06-30 23:11:50Z

I'm assuming you're computing the sample variance $s^2$, whose square root is the sample standard deviation $s$: $$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 $$

For normally distributed data, this sample variance has the following distribution: $$ \frac{(n-1)s^2}{\sigma^2} \sim \chi^2(df = n-1) $$
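As a short derivation sketch of how an interval follows from this distribution, write $L$ and $U$ for the 0.025 and 0.975 quantiles of $\chi^2(n-1)$ (the same symbols used in the rest of this answer). Then $$ P\left(L \le \frac{(n-1)s^2}{\sigma^2} \le U\right) = 0.95 \iff P\left(\frac{(n-1)s^2}{U} \le \sigma^2 \le \frac{(n-1)s^2}{L}\right) = 0.95 $$

Inverting the inequalities swaps the roles of the quantiles: the larger quantile $U$ produces the lower endpoint, and the smaller quantile $L$ produces the upper endpoint.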

So for the 95% confidence interval for $\sigma^2$ we get $$ \left(\frac{(n-1)s^2}{U}, \frac{(n-1)s^2}{L} \right) $$

where $U$ and $L$ are the upper and lower quantiles of the $\chi^2(n-1)$ distribution that each cut off a tail probability of $0.025$. Using a calculator or stats package, with $n = 100$ (so $99$ degrees of freedom), you would get $U \approx 128.42$ and $L \approx 73.36$, so $$ CI(\sigma^2) \approx \left(0.771s^2, 1.350s^2\right) $$

You can just take square roots of the endpoints at the end to find a CI for $\sigma$, roughly $\left(0.878s, 1.162s\right)$. Although $s$ itself is technically a slightly biased estimator ($E(s^2) = \sigma^2$, but $E(s) < \sigma$), it shouldn't be an issue for $n = 100$.
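The calculation above can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: to stay within the standard library it uses the Wilson-Hilferty approximation for the chi-square quantiles (an assumption made here; with SciPy installed, `scipy.stats.chi2.ppf(0.975, 99)` would give the exact quantile instead). The function names `chi2_quantile_wh` and `sigma_ci_factors` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile_wh(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * sqrt(a)) ** 3


def sigma_ci_factors(n, conf=0.95):
    """Return (lo, hi) such that (lo * s, hi * s) is a CI for sigma."""
    alpha = 1.0 - conf
    df = n - 1
    upper_q = chi2_quantile_wh(1.0 - alpha / 2.0, df)  # U in the answer
    lower_q = chi2_quantile_wh(alpha / 2.0, df)        # L in the answer
    # Variance factors are df/U and df/L; square roots give sigma factors.
    return sqrt(df / upper_q), sqrt(df / lower_q)


lo, hi = sigma_ci_factors(100)
print(f"sigma is between {lo:.3f} * s and {hi:.3f} * s")
```

Multiplying the measured $s$ by the two factors gives the interval endpoints; for $n = 100$ the approximate quantiles agree with $U \approx 128.42$ and $L \approx 73.36$ to about two decimal places.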

Note: $\chi^2$ is an asymmetric distribution, so the confidence interval is asymmetric in turn.
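One way to sanity-check the interval is a small simulation, sketched here with the standard library only. The true mean and standard deviation (`TRUE_MEAN`, `TRUE_SIGMA`), the trial count, and the seed are arbitrary values chosen for illustration; `U` and `L` are the quantiles quoted above. If the reasoning holds, roughly 95% of the intervals built from independent samples of 100 should contain the true variance.

```python
import random

N = 100                # units measured per experiment
DF = N - 1
U, L = 128.42, 73.36   # chi-square quantiles for 99 degrees of freedom
TRUE_MEAN = 10.0       # hypothetical population mean
TRUE_SIGMA = 2.0       # hypothetical population standard deviation


def sample_variance(xs):
    """Unbiased sample variance with the 1/(n-1) normalization."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def coverage(trials=2000, seed=0):
    """Fraction of simulated intervals that contain the true variance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(TRUE_MEAN, TRUE_SIGMA) for _ in range(N)]
        s2 = sample_variance(xs)
        if DF * s2 / U <= TRUE_SIGMA ** 2 <= DF * s2 / L:
            hits += 1
    return hits / trials


print(f"empirical coverage: {coverage():.3f}")
```

The interval is a statement about the underlying $\sigma$ of the process, so the number of units sold later does not enter the calculation; only the number of units measured does.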


If I'm interpreting this correctly, the CI you compute above corresponds to the n=100 sample population. What I'm trying to estimate is the corresponding CI for n=1000000 sample population, when all I have measured is n=100 samples. Is this possible, if we assume the distribution is normal? Or, it won't change based on population size (n) so the same CI applies?
– user1657949, Commented Jul 2 at 4:38

I would look at $1,\!000,\!000$ (plus infinite counterfactuals that were never produced) as the entire population. Out of that population, you are sampling $n=100$ to study. You can only extract a CI from the ones you measure; you cannot just make $n = 1,\!000,\!000$ unless you measure all of those too.
– Pavan C., Commented Jul 2 at 4:47

Got it. Thank you - very useful.
– user1657949, Commented Jul 2 at 4:53
