## Wednesday, February 11, 2009

### Confidence and prediction bands: the shape of things to come

Confidence bands and prediction intervals (or bands) are useful for quantifying the uncertainty associated with model predictions. Quantifying uncertainty has grown in importance with the adoption of a risk-based approach consistent with ICH Q8 and Q9: it provides a basis for estimating the probability of successful operation at a given set of conditions, and thereby for defining a design space.

Confidence bands for linear models, such as those produced by statistics software packages, are often ‘u-shaped’ like those shown in Figure 1. Users who are familiar with these plots may expect similarly shaped confidence bands for other types of model. In fact, u-shaped confidence bands are the exception rather than the rule, as discussed in a knowledge base article (login required).

Prediction intervals (or bands) are wider than confidence bands and tend to run more nearly parallel to the fitted mean response.
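For the straight-line case, the standard textbook expressions make the difference explicit. The notation here is an assumption introduced for illustration and is not taken from the figures: $s$ is the residual standard deviation, $n$ the number of data points, $\bar{x}$ the mean of the $x$ values, $S_{xx} = \sum_i (x_i - \bar{x})^2$ and $t$ the appropriate Student's t quantile with $n-2$ degrees of freedom.

```latex
\begin{align}
\text{confidence band:} \quad & \hat{y}(x_0) \pm t \, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\
\text{prediction band:} \quad & \hat{y}(x_0) \pm t \, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\end{align}
```

The extra 1 under the square root accounts for the scatter of a new observation about the mean response. It does not depend on $x_0$, so once it dominates the other terms the prediction band runs almost parallel to the fitted line.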

U-shaped confidence bands (indicated in Figure 1 by the blue curves around the best fit line) are observed when fitting a linear model (y=mx+c) in which both the slope and the intercept are fitted. In these cases, the confidence bands are narrowest at the average value of x (see, e.g., reference 1) and widen on either side of this value.
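The u-shape can be reproduced with a few lines of Python. This is a minimal sketch using the standard least-squares formulas; the data values and function names are made up for illustration. It returns the standard error of the fitted mean response, which would be multiplied by a Student's t quantile to give the band half-width.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + c."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = ybar - m * xbar
    # Residual standard deviation with n - 2 degrees of freedom
    rss = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(rss / (n - 2))
    return m, c, s, xbar, sxx

def se_mean_response(x0, n, s, xbar, sxx):
    """Standard error of the fitted mean response at x0."""
    return s * math.sqrt(1.0 / n + (x0 - xbar) ** 2 / sxx)

# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, c, s, xbar, sxx = fit_line(xs, ys)
grid = [0.0, 1.5, 3.0, 4.5, 6.0]
widths = [se_mean_response(x0, len(xs), s, xbar, sxx) for x0 in grid]

for x0, w in zip(grid, widths):
    print(f"x = {x0:4.1f}   standard error = {w:.4f}")
```

The smallest value is at x = 3, the mean of the x data, and the values are symmetric about that point.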

When a linear model of form y=mx is fitted instead, confidence bands are no longer u-shaped, but run as straight lines diverging from the best fit line as x increases, as in Figure 2.
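A minimal sketch of the no-intercept case, again with made-up data and hypothetical function names. With only the slope fitted, the standard error of the mean response is |x| times the standard error of the slope, so the band is a pair of straight lines through the origin.

```python
import math

def fit_through_origin(xs, ys):
    """Least-squares fit of y = m*x; returns the slope and its standard error."""
    sum_x2 = sum(x * x for x in xs)
    m = sum(x * y for x, y in zip(xs, ys)) / sum_x2
    # Residual standard deviation with n - 1 degrees of freedom (one fitted parameter)
    rss = sum((y - m * x) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(rss / (len(xs) - 1))
    se_m = s / math.sqrt(sum_x2)
    return m, se_m

def se_mean_response(x0, se_m):
    """Standard error of the fitted mean response at x0: linear in |x0|."""
    return abs(x0) * se_m

# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, se_m = fit_through_origin(xs, ys)
for x0 in [0.0, 2.0, 4.0, 6.0]:
    print(f"x = {x0:4.1f}   standard error = {se_mean_response(x0, se_m):.4f}")
```

The standard error is zero at x = 0 and doubles when x doubles.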

Mechanistic models of most interest for design space and QbD work are initial value problems, where the initial values of responses are known, the independent variable is time and the rates of change of those responses are calculated from ordinary differential equations.

The general procedure for obtaining asymptotic confidence bands for such a non-linear mechanistic model follows the same steps as the two linear cases above: calculation of the gradients of responses with respect to the fitted parameter values and matrix multiplication of these gradients with the covariance matrix of the fitted parameters.
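Written out, this is the usual first-order (delta method) approximation. The symbols are assumptions introduced here: $\hat{\theta}$ is the vector of $p$ fitted parameters, $C$ its covariance matrix, $g(t)$ the gradient of the predicted response with respect to the parameters, $n$ the number of data points and $q$ the Student's t quantile.

```latex
\begin{align}
g_i(t) &= \left. \frac{\partial \hat{y}(t)}{\partial \theta_i} \right|_{\theta = \hat{\theta}} \\
\operatorname{Var}\left[\hat{y}(t)\right] &\approx g(t)^{\mathsf{T}} \, C \, g(t) \\
\text{confidence band:} \quad & \hat{y}(t) \pm q_{1-\alpha/2,\,n-p} \sqrt{g(t)^{\mathsf{T}} \, C \, g(t)}
\end{align}
```

For the straight line y=mx+c the gradient is simply $(x, 1)$, and for y=mx it is $x$, which recovers the two linear cases. Wherever the gradient vanishes, the band has zero width.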

The qualitative behaviour of the confidence bands can be deduced at certain limits without any calculations:
- The initial values of integrated responses are not sensitive to the parameter estimates, so their confidence bands have zero width at time zero, as for a linear model with no intercept.
- The values of some integrated responses become constant at long times, e.g. towards the end of a simulation when all rates of change have dropped to zero. Such final values are again insensitive to the parameter estimates, so the confidence bands again have zero width.
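Both limits can be checked on a deliberately simple example. The sketch below is not the competing-reaction model discussed in this post: it assumes a single first-order reaction A -> P with one fitted rate constant, uses the closed-form solution of the rate equation rather than a numerical integrator, and takes made-up values for the rate constant and its standard error.

```python
import math

A0 = 1.0     # assumed initial reactant level (known, not fitted)
K = 0.5      # assumed fitted rate constant
SE_K = 0.05  # assumed standard error of K (the covariance matrix is 1x1 here)

def product(t, k):
    """Product level for A -> P: solution of dP/dt = k*(A0 - P) with P(0) = 0."""
    return A0 * (1.0 - math.exp(-k * t))

def sensitivity(t, k, h=1e-6):
    """Central finite-difference estimate of dP/dk at time t."""
    return (product(t, k + h) - product(t, k - h)) / (2.0 * h)

def band_se(t):
    """Asymptotic standard error of the predicted product level: sqrt(g * var_k * g)."""
    g = sensitivity(t, K)
    return math.sqrt(g * SE_K ** 2 * g)

times = [0.0, 1.0, 2.0, 4.0, 8.0, 40.0]
ses = [band_se(t) for t in times]

for t, se in zip(times, ses):
    print(f"t = {t:5.1f}   P = {product(t, K):.4f}   standard error = {se:.2e}")
```

For this model the sensitivity is A0·t·exp(-k·t): exactly zero at time zero, largest at t = 1/k while the product level is still changing, and vanishingly small once the reaction is complete.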

Figure 3 shows one example of a response from a non-linear mechanistic model of this type, for product formation in a system of competing chemical reactions. This has typical confidence band behaviour for such a profile (confidence band width plotted in green, values on the right hand y-axis):

- zero confidence band width at the start
- maximum confidence band width when the product level is changing rapidly
- almost zero confidence band width at the end, when the reaction is nearly over.

References
1. Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building, George E. P. Box, William G. Hunter, J. Stuart Hunter, John Wiley & Sons, 1978.