What is a confidence interval?
In every drug manufacturing process, controls are mandatory. Thinking about the patient, the goal should be to implement processes that are 100% error free, but that is practically impossible. For example, it is not feasible to unpack, check, and analyse every tablet from a batch to check that no mistake has been overlooked.
For this reason, a sample is collected randomly from the batch and examined for any unacceptable parameters. Then, based on the results, an estimate is made that relates to the entire batch. This is where the concept of the “confidence interval” comes into play.
Thus, values determined during the in-process controls (IPCs) are compared with previously defined target values. Supposing production adheres to these values very precisely, the real values in a normal distribution will be distributed very closely around this target value. Put simply, consider a case where a tablet is expected to contain 400 mg of active ingredient. The tablets produced would seldom match the expected active ingredient content exactly but would rather contain something between 390 mg and 410 mg. As the producer strives to meet the 400 mg mark, the chances are high that the content of most tablets would vary between 395 mg and 405 mg. If it were feasible to plot all the tablets of the batch against their respective active ingredient content, a classical normal distribution curve would be obtained. From this, the "real" mean value, the so-called expectation value µ, could be determined. As a quality feature, in this case, the specification requires the “real” mean value to be between 395 mg and 405 mg.
As mentioned earlier, it is not viable to examine every single tablet, and hence the expectation value will remain unknown to us. Since we nevertheless want to know whether the current batch of tablets fits within the range of the target value, a sample of, e.g., 100 tablets is drawn and analysed. There is a high probability that the mean of the sample will differ from the “true” mean. In other words, it is highly unlikely that the expectation value will exactly match the mean value of the sample. So, it is fair to say that the “real” mean value cannot be determined without destroying the entire batch, and that would indeed be a genuine problem.
To address this issue, a “range” rather than a single value is chosen within which the “true” value is presumed to lie. This range is called the “confidence interval”. The larger this range is, the higher the likelihood that it contains the “true” value. It is therefore tempting to choose the widest possible confidence interval so that the “real” mean value is always included. However, as the confidence interval widens, the precision of the estimate decreases, so we would like to have a narrow confidence interval. This raises another important question: a randomly drawn sample may happen to consist largely of tablets located at the outer edge of the normal distribution of the active substance quantity. The chance that a confidence interval calculated from such an unrepresentative sample contains the “true” mean is slim. Although the probability of drawing such a sample is low, it cannot be neglected. To account for this probability, a so-called “confidence level” is defined. A confidence level of 95% essentially means that, if the sampling were repeated many times, about 95% of the confidence intervals calculated from those samples would contain the "true" mean. The remaining 5% of the intervals would miss it.
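The meaning of the confidence level can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the "true" mean of 400 mg, the standard deviation of 12.4 mg, the sample size of 100, the number of repetitions, and the function name coverage are all chosen for illustration, and the interval is calculated with the usual normal approximation (sample mean ± z * s / √(n)).

```python
import math
import random
from statistics import NormalDist, mean, stdev


def coverage(true_mean=400.0, true_sd=12.4, n=100, level=0.95,
             trials=2000, seed=1):
    """Fraction of simulated samples whose interval contains true_mean.

    All default values are illustrative assumptions, not real batch data.
    """
    rng = random.Random(seed)
    # Two-sided critical z-value for the chosen confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of n tablets from the assumed batch.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = mean(sample)
        half_width = z * stdev(sample) / math.sqrt(n)
        if x_bar - half_width <= true_mean <= x_bar + half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Share of intervals containing the true mean: {coverage():.3f}")
```

With a confidence level of 95%, the printed share should come out close to 0.95; raising the level widens each interval and increases the share, at the cost of precision.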
But how is the confidence interval defined? The width of the confidence interval depends on the standard deviation (σ) of the sample, the sample size (n) and the selected confidence level. The confidence level can be chosen individually. In the pharmaceutical industry, a confidence level of 95% is usually applied. The selected confidence level is then used to determine the critical z-value using a normal distribution table. For a confidence level of 95%, this results in a z-value of 1.96. Using this z-value, the standard deviation of the sample and the sample size (n), the confidence interval (CI) for the true value can be determined according to the formula:
CI for µ = x ± z * σ / √(n)
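Instead of looking up the critical z-value in a table, it can also be computed. The sketch below uses NormalDist from Python's standard library; the helper name critical_z is an assumption made for this example.

```python
from statistics import NormalDist


def critical_z(level):
    """Two-sided critical z-value for a confidence level such as 0.95."""
    # A two-sided interval leaves (1 - level) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + level / 2)


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        print(f"{level:.0%}: z = {critical_z(level):.3f}")
    # 90%: z = 1.645
    # 95%: z = 1.960
    # 99%: z = 2.576
```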
Let's say a total of 100 tablets were analyzed and a mean (x) of 398.8 mg and a standard deviation of 12.4 mg were obtained. The confidence limits for our confidence interval may be calculated as follows:
CI for µ = x ± 1.96 * σ / √(n) = 398.8 mg ± 1.96 * 12.4 mg / √(100) = 398.8 mg ± 2.43 mg
That is, with 95% confidence, the "true" mean µ lies within the confidence interval [396.4; 401.2] and would thus be within our specification (between 395 and 405 mg).
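The calculation above can be reproduced in a few lines. This is a minimal sketch: the function names mean_confidence_interval and within_spec are assumptions for this example, and the specification limits of 395 mg and 405 mg are taken from the text.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(x_bar, sd, n, level=0.95):
    """Return (lower, upper) limits of the z-based CI for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sd / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


def within_spec(interval, lower=395.0, upper=405.0):
    """True if the whole interval lies inside the specification limits."""
    ci_lower, ci_upper = interval
    return lower <= ci_lower and ci_upper <= upper


if __name__ == "__main__":
    ci = mean_confidence_interval(x_bar=398.8, sd=12.4, n=100)
    print(f"[{ci[0]:.1f}; {ci[1]:.1f}]", within_spec(ci))
    # [396.4; 401.2] True
```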
The higher the standard deviation in the sample (thus the statistical spread), the wider the confidence interval. This also explains why a broad confidence interval is a sign of an imprecise estimate. At the same time, the confidence interval can be narrowed by increasing the sample size. The standard deviation of the individual tablets does not shrink as more tablets are analyzed, but the uncertainty of the mean, σ / √(n), does: the more tablets are analyzed, the closer the sample mean gets to the "true" mean. For example, if only 10 tablets were examined, confidence limits of ± 7.7 mg would be obtained for the same mean and standard deviation, resulting in a confidence interval of [391.1; 406.5] that is no longer compliant with the specification. Hence, the lot would not be allowed to be released.
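The influence of the sample size can be made visible by evaluating the term z * σ / √(n) for several values of n. The sketch below keeps the standard deviation of 12.4 mg and the z-value of 1.96 from the example; the helper names half_width and required_n, and the target half-width of 2.0 mg in the usage lines, are assumptions for illustration.

```python
import math

Z_95 = 1.96   # critical z-value for a 95% confidence level
SD = 12.4     # standard deviation in mg, taken from the tablet example


def half_width(n, sd=SD, z=Z_95):
    """Half-width of the confidence interval for a sample of n tablets."""
    return z * sd / math.sqrt(n)


def required_n(target_half_width, sd=SD, z=Z_95):
    """Smallest sample size whose half-width does not exceed the target."""
    return math.ceil((z * sd / target_half_width) ** 2)


if __name__ == "__main__":
    for n in (10, 30, 100, 400):
        print(f"n = {n:>3}: +/- {half_width(n):.2f} mg")
    print("n needed for +/- 2.0 mg:", required_n(2.0))
```

Because n enters under a square root, quadrupling the sample size only halves the width of the interval.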
To conclude, the larger the sample size (n) chosen, the narrower the confidence interval and the more precise our estimate. The size of each individual sample is therefore crucial for the trustworthiness of the control of the manufacturing process, whereas the total number of tablets tested is, on its own, less decisive. Let's take the tablet example: if 10 random samples of 10 tablets each are taken per batch and evaluated separately, each estimate is less precise than if 1 sample of 100 tablets is taken, even though the same number of tablets, 100 pieces, is used.
Finally, for method validation, knowledge of how to calculate confidence intervals is necessary, as the ICH Q2(R1) guideline recommends them for the parameter trueness as well as for all types of precision. Whether this is useful and what other possible applications for confidence intervals exist with respect to method validation is detailed in this blog article.
But speaking of possible applications... Confidence intervals can be created not only around a single value, but also around a trend. This makes sense, for example, in stability studies to determine the shelf life of tablets before they are marketed. To do this, we use stability-indicating methods at specified times to track, for example, the content of the tablets of a certain number of batches (e.g. 3) and compare it to the specification. If we plot the content on the y-axis against the time in months on the x-axis, we will see that the content is likely to decrease a little over time. Even though we used the mean values of the 3 batches we examined to form the data points, our observed progression is still only a small sample, and if we examined a larger sample (i.e., many more batches), we would conceivably see more steeply declining progressions. Therefore, in order to cover the possible progressions with a selected probability (see the confidence level above), we span a confidence interval around the decreasing regression line of our data points at each time point; the true stability progression is expected to lie within this band. The time at which the lower limit of the confidence interval intersects with the lower specification limit for the content is the maximum shelf life.
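The shelf-life estimate described above can be sketched as follows. Everything specific in this example is an assumption made for illustration: the stability data are invented, the lower specification limit is set to 95% of the label claim, the interval is the standard two-sided 95% confidence interval for the mean response of a simple linear regression, and the t-value of 2.776 is the table value for 4 degrees of freedom (6 data points minus 2). Reading the intersection far beyond the last measured time point is an extrapolation and is shown here only to demonstrate the principle.

```python
import math

# Hypothetical stability data: time in months and mean content in % of label claim.
months = [0, 3, 6, 9, 12, 18]
content = [100.1, 99.6, 99.3, 98.7, 98.4, 97.5]

T_CRIT = 2.776  # two-sided 95% t-value for n - 2 = 4 degrees of freedom


def fit_line(x, y):
    """Least-squares fit; returns slope, intercept, residual SD, mean of x, Sxx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    resid_sd = math.sqrt(rss / (n - 2))
    return slope, intercept, resid_sd, x_bar, sxx


def lower_limit(t, fit, n, t_crit=T_CRIT):
    """Lower confidence limit of the regression line at time t."""
    slope, intercept, resid_sd, x_bar, sxx = fit
    se = resid_sd * math.sqrt(1 / n + (t - x_bar) ** 2 / sxx)
    return intercept + slope * t - t_crit * se


def shelf_life(x, y, spec_lower=95.0, max_months=120):
    """Last whole month at which the lower limit is still within specification."""
    fit = fit_line(x, y)
    for month in range(max_months + 1):
        if lower_limit(month, fit, len(x)) < spec_lower:
            return max(month - 1, 0)
    return max_months


if __name__ == "__main__":
    print("Estimated shelf life in months:", shelf_life(months, content))
```

The confidence band is narrowest near the mean of the measured time points and widens towards later times, which is why the lower limit reaches the specification earlier than the regression line itself.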
Subsequently, the confidence interval placed around the stability data submitted for market authorization could also be used as intervention limits for out-of-trend (OOT) results. This could be done in the context of trending for results obtained in the course of ongoing stability studies. Accordingly, all values that do not lie within the confidence interval would be considered OOT results and corresponding actions would have to be taken.