Kriging Model

This section provides details on the Kriging Model approximation technique used in Isight.

Related Topics
Approximation References

The Kriging model has its roots in the field of geostatistics—a hybrid discipline of mining, engineering, geology, mathematics, and statistics (Cressie, 1993)—and is useful in predicting temporally and spatially correlated data. The Kriging model is named after D. G. Krige, a South African mining engineer who, in the 1950s, developed empirical methods for determining true ore grade distributions from distributions based on sample ore grades (Matheron, 1963). Several texts exist that describe the Kriging model and its usefulness for predicting spatially correlated data (Cressie, 1993) and mining (Journel and Huijbregts, 1978). Kriging meta models are extremely flexible because you can choose from a wide range of correlation functions to build the meta model.

In addition, depending on the correlation function that you choose, the meta model can either “honor the data,” providing an exact interpolation of the data, or “smooth the data,” providing an inexact interpolation (Cressie, 1993).

The following commonly used Kriging correlation functions are available in Isight:

  • Exponential

  • Gaussian

  • Matern Linear

  • Matern Cubic

Kriging postulates a combination of a polynomial model and departures of the following form:

$$y(x) = f(x) + Z(x)$$

where $y(x)$ is the unknown function of interest, $f(x)$ is a known polynomial function of $x$ called the trend, and $Z(x)$ is the realization of a stochastic process with mean zero, variance $\sigma^2$, and nonzero covariance. The $f(x)$ term in the previous equation is similar to the polynomial model in a response surface, providing a “global” model of the design space. In many cases $f(x)$ is taken to be a constant term $\beta_0$; the Isight implementation assumes a constant $f(x)$ term.

While $f(x)$ “globally” approximates the design space, $Z(x)$ creates “localized” deviations so that the Kriging model interpolates the $n_s$ sampled data points. The covariance matrix of $Z(x)$ that dictates the local deviations is

$$\mathrm{Cov}[Z(x_i), Z(x_j)] = \sigma^2 R\left([R(x_i, x_j)]\right)$$

where $R$ is the correlation matrix, and $R(x_i, x_j)$ is the correlation function between any two of the $n_s$ sampled data points $x_i$ and $x_j$. $R$ is an $n_s \times n_s$ symmetric, positive definite matrix with ones along the diagonal.

Many different correlation functions exist. The following correlation functions are provided within Isight:

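Name             Correlation

Exponential      $\mathrm{corr}(X_i, X_j) = \prod_{k=1}^{n_{dv}} e^{-\theta_k |X_{ik} - X_{jk}|}$

Gaussian         $\mathrm{corr}(X_i, X_j) = \prod_{k=1}^{n_{dv}} e^{-\theta_k |X_{ik} - X_{jk}|^2}$

Matern Linear    $\mathrm{corr}(X_i, X_j) = \prod_{k=1}^{n_{dv}} \left(1 + \theta_k |X_{ik} - X_{jk}|\right) e^{-\theta_k |X_{ik} - X_{jk}|}$

Matern Cubic     $\mathrm{corr}(X_i, X_j) = \prod_{k=1}^{n_{dv}} \left(1 + \theta_k |X_{ik} - X_{jk}| + \tfrac{1}{2}\theta_k^2 |X_{ik} - X_{jk}|^3\right) e^{-\theta_k |X_{ik} - X_{jk}|}$

In the above table, $n_{dv}$ is the number of design variables and $\theta_k$ are the unknown correlation parameters used to fit the model.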
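As a rough illustration of how such correlation functions can be evaluated for a pair of points, the following Python sketch implements the Exponential, Gaussian, and Matern Linear forms. It assumes that the per-variable terms are multiplied over the design variables; the function name kriging_corr and its arguments are hypothetical and are not part of Isight. The Matern Cubic form would follow the same pattern with its own polynomial factor.

```python
import math


def kriging_corr(xi, xj, theta, kind="gaussian"):
    """Correlation between two points xi and xj (sequences of equal length).

    Assumption: one positive theta per design variable, and the
    per-variable terms are multiplied together.
    """
    corr = 1.0
    for x_ik, x_jk, t in zip(xi, xj, theta):
        d = abs(x_ik - x_jk)  # distance along design variable k
        if kind == "exponential":
            corr *= math.exp(-t * d)
        elif kind == "gaussian":
            corr *= math.exp(-t * d ** 2)
        elif kind == "matern_linear":
            corr *= (1.0 + t * d) * math.exp(-t * d)
        else:
            raise ValueError(f"unknown correlation function: {kind}")
    return corr


# Identical points are perfectly correlated; the correlation decays
# as the points move apart.
same = kriging_corr([0.2, 0.7], [0.2, 0.7], [1.0, 2.0], "gaussian")
apart = kriging_corr([0.0, 0.0], [1.0, 1.0], [1.0, 2.0], "gaussian")
```

Larger values of a given theta make the correlation fall off faster along that design variable, so the fitted model reacts more locally in that direction.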

Once the correlation function has been selected and the best $\theta_k$ estimated, the Kriging model can be used to predict the response $y(X)$ at an untried location $X$ using

$$\hat{y}(X) = \hat{\beta} + r^T(X)\, R^{-1} (Y - f\hat{\beta}),$$

where $Y$ is the vector of observed response values at the $n_s$ sample points, $f$ is the vector of trend function values at each sample point (a vector of ones when the trend is constant), $\hat{\beta}$ is the estimated constant trend term, and $r^T(X)$ is the vector of correlation values between the untried location $X$ and the sample data points.

The constant $\hat{\beta}$ can be estimated using the equation

$$\hat{\beta} = (f^T R^{-1} f)^{-1} f^T R^{-1} Y.$$

The estimate of the variance is

$$\sigma^2 = \frac{(Y - f\hat{\beta})^T R^{-1} (Y - f\hat{\beta})}{n_s}.$$
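The estimates of the trend constant and the variance, together with the predictor, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a Gaussian correlation function, a constant trend (so f is a vector of ones), fixed theta values, and a small hand-written Cholesky solver so that the block needs nothing beyond the standard library. The names fit, predict, cholesky, and solve are hypothetical; the sketch does not describe Isight's internal implementation.

```python
import math


def gaussian_corr(a, b, theta):
    # Gaussian correlation with one theta per design variable.
    return math.exp(-sum(t * (u - v) ** 2 for t, u, v in zip(theta, a, b)))


def cholesky(A):
    # Lower-triangular L with A = L L^T (A symmetric positive definite).
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve(L, b):
    # Solve (L L^T) x = b by forward, then backward substitution.
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit(X, Y, theta):
    n = len(X)
    R = [[gaussian_corr(X[i], X[j], theta) for j in range(n)] for i in range(n)]
    L = cholesky(R)
    ones = [1.0] * n  # constant trend: f is a vector of ones
    # beta = (f^T R^-1 f)^-1 f^T R^-1 Y; with f = ones these are plain sums.
    beta = sum(solve(L, Y)) / sum(solve(L, ones))
    resid = [y - beta for y in Y]
    w = solve(L, resid)  # R^-1 (Y - f beta)
    sigma2 = sum(r * wi for r, wi in zip(resid, w)) / n
    return {"X": X, "theta": theta, "beta": beta, "w": w, "sigma2": sigma2}


def predict(model, x):
    # y_hat(x) = beta + r^T(x) R^-1 (Y - f beta)
    r = [gaussian_corr(x, xi, model["theta"]) for xi in model["X"]]
    return model["beta"] + sum(ri * wi for ri, wi in zip(r, model["w"]))


# Three made-up sample points in one design variable.
X = [[0.0], [0.5], [1.0]]
Y = [0.0, 0.25, 1.0]
model = fit(X, Y, theta=[4.0])
y_new = predict(model, [0.75])
```

Evaluating predict at one of the sample points returns the sampled response (up to rounding), which is the exact-interpolation behavior described earlier.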

The maximum likelihood estimate (i.e., the best value) of $\theta_k$ is obtained by maximizing the log-likelihood expression

$$-\frac{n_s \ln(\sigma^2) + \ln|R|}{2}$$

Isight also supports the creation of isotropic approximations with the Kriging model. An isotropic approximation, as the name implies, is used when all independent variables behave similarly; Isight consequently assumes that all $\theta_k$ values are identical. Because only one optimum $\theta$ value is searched for, an isotropic fit is usually faster.
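To make the isotropic case concrete, the Python sketch below evaluates the log-likelihood expression for a single shared theta and picks the best value from a coarse logarithmic grid. The grid search is a stand-in chosen for brevity; this section does not state which optimizer Isight uses. The Gaussian correlation function, the sample data, and the names log_likelihood, cholesky, and solve are likewise assumptions made for illustration.

```python
import math


def cholesky(A):
    # Lower-triangular L with A = L L^T (A symmetric positive definite).
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve(L, b):
    # Solve (L L^T) x = b by forward, then backward substitution.
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def log_likelihood(X, Y, theta):
    """-(n_s ln(sigma^2) + ln|R|) / 2 for one shared (isotropic) theta."""
    n = len(X)
    R = [[math.exp(-theta * sum((u - v) ** 2 for u, v in zip(X[i], X[j])))
          for j in range(n)] for i in range(n)]
    L = cholesky(R)
    ones = [1.0] * n  # constant trend
    beta = sum(solve(L, Y)) / sum(solve(L, ones))
    resid = [y - beta for y in Y]
    sigma2 = sum(r * w for r, w in zip(resid, solve(L, resid))) / n
    log_det_R = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -(n * math.log(sigma2) + log_det_R) / 2.0


# Made-up samples of a sine wave in one design variable.
X = [[0.0], [0.25], [0.5], [0.75], [1.0]]
Y = [math.sin(2.0 * math.pi * x[0]) for x in X]

# Candidate theta values from about 0.32 to 100, evenly spaced in log10.
grid = [10.0 ** (e / 4.0) for e in range(-2, 9)]
best_theta = max(grid, key=lambda t: log_likelihood(X, Y, t))
```

In the anisotropic case the same expression is maximized over one theta per design variable, so the search space grows with the number of inputs; that is why the single-parameter isotropic search tends to be cheaper.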

Depending on the number of input parameters, the number of design points, and the number of responses (outputs) of the Kriging model, the model building process can be time consuming. As the size of the matrices increases, the CPU time required for manipulating them grows rapidly: factoring a dense $n_s \times n_s$ matrix typically takes on the order of $n_s^3$ operations, and the correlation matrix must be rebuilt and factored for each candidate set of $\theta_k$ values. Therefore, generating a good Kriging model that uses many design points can take a substantial amount of time even after all data points are analyzed.

The quality of a Kriging model depends on the location of the sample points in the design space. The Kriging model has been observed to perform best with space-filling designs, where sample points are placed far apart. When points are clustered together, the matrices used in fitting the Kriging model become ill-conditioned, resulting in a poor fit. To avoid ill-conditioning, you can filter points from the sample based on distance: points that lie closer to another sample point than a threshold value, called the Smoothing Filter, are removed from the sample set before fitting. Isight also uses other numerical techniques internally to improve the performance and robustness of the approximation.
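A distance-based filter of this kind might look like the following Python sketch. It is an assumption-laden illustration, not Isight's algorithm: it uses Euclidean distance and a greedy rule that keeps the first point of any close pair, whereas this section does not say which points of a cluster are dropped. The name filter_close_points and the sample values are hypothetical.

```python
import math


def filter_close_points(points, min_dist):
    """Keep a point only if it is at least min_dist from every kept point."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept


# Two tight clusters of made-up points; one point per cluster survives.
samples = [(0.0, 0.0), (0.01, 0.0), (1.0, 1.0), (1.0, 1.005)]
filtered = filter_close_points(samples, min_dist=0.05)
```

Because the result depends on the order in which points are visited, a real implementation would need a defined rule for choosing which point of a close pair to keep.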

Once a Kriging model has been built and deemed sufficiently accurate, it is ready to be used in design analysis. Fitting a Kriging model takes noticeably longer than fitting other interpolation techniques such as Radial Basis Functions. However, the prediction times are comparable.