Multivariate normal distribution



         


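In probability theory and statistics, a random vector <math>X=(X_1,\ldots,X_n)</math> follows a multivariate normal distribution, also sometimes called a multivariate Gaussian distribution (in honor of Carl Friedrich Gauss, who was not the first to write about the normal distribution), if it satisfies the following equivalent conditions:

Every linear combination <math>a_1X_1+\cdots+a_nX_n</math> is normally distributed.

There is a random vector <math>Z=(Z_1,\ldots,Z_m)</math> whose components are independent standard normal random variables, a vector <math>{\mathbf\mu}=(\mu_1,\ldots,\mu_n)</math> and an <math>n\times m</math> matrix <math>{\mathbf A}</math> such that <math>X={\mathbf A}Z+{\mathbf\mu}</math>.

There is a vector <math>{\mathbf\mu}</math> and a symmetric, positive semi-definite matrix <math>{\mathbf\Sigma}</math> such that the characteristic function of X is

<math>\varphi_X({\mathbf u})=\exp\left(i{\mathbf\mu}^T{\mathbf u}-\frac{1}{2}{\mathbf u}^T{\mathbf\Sigma}{\mathbf u}\right).</math>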

The following condition, stated in terms of a probability density, is not quite equivalent to the conditions above, since it does not allow for a singular covariance matrix:

<math>
f_X(x_1,\ldots,x_n)\, dx_1\ldots dx_n= \frac{1}{(2\pi)^{n/2}|{\mathbf\Sigma}|^{1/2}} \exp\left(-\frac{1}{2}({\mathbf x}-{\mathbf\mu})^T{\mathbf\Sigma}^{-1}({\mathbf x}-{\mathbf\mu}) \right)dx_1\ldots dx_n
</math>

where <math>\left|{\mathbf\Sigma}\right|</math> is the determinant of <math>{\mathbf\Sigma}</math>. Note how the equation above reduces to the density of the univariate normal distribution if <math>{\mathbf\Sigma}</math> is a <math>1\times 1</math> matrix (i.e. a real number).
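The reduction to the univariate case can be checked numerically. The following is a minimal sketch, assuming NumPy is available and that the covariance matrix is nonsingular; the function names `mvn_pdf` and `univariate_pdf` are illustrative and do not come from the article.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x; assumes sigma is symmetric positive definite."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    n = mu.size
    diff = x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), computed with a linear
    # solve instead of an explicit matrix inverse.
    quad = diff @ np.linalg.solve(sigma, diff)
    norm_const = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)

def univariate_pdf(x, mu, var):
    """Density of the univariate normal distribution with mean mu and variance var."""
    return float(np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var))

# With a 1 x 1 covariance matrix the two formulas agree.
p_multi = mvn_pdf([0.3], [1.0], [[2.0]])
p_uni = univariate_pdf(0.3, 1.0, 2.0)

# Standard bivariate normal evaluated at its mean: 1 / (2 * pi).
p_origin = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

The function fails, or returns meaningless values, when the covariance matrix is singular, which is exactly the case the density formula does not cover.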

The vector <math>{\mathbf\mu}</math> in these conditions is the expected value of X, and the matrix <math>{\mathbf\Sigma}={\mathbf A}{\mathbf A}^T</math> is the covariance matrix of the components <math>X_i</math>. It is important to realize that the covariance matrix must be allowed to be singular. That case arises frequently in statistics; for example, in the distribution of the vector of residuals in ordinary linear regression problems. Note also that the <math>X_i</math> are in general not independent; they can be seen as the result of applying the linear transformation <math>{\mathbf A}</math> to a collection of independent standard Gaussian variables Z and then adding <math>{\mathbf\mu}</math>, that is, <math>X={\mathbf A}Z+{\mathbf\mu}</math>.
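The construction of X from independent Gaussian variables, including the singular case, can be illustrated with a short simulation. This is only a sketch, assuming NumPy; the particular values of `mu` and `A`, the seed and the sample size are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, -1.0, 0.0])
# A is 3 x 2, so Sigma = A A^T is a 3 x 3 matrix of rank 2 (singular).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
sigma = A @ A.T

# Each row of Z holds two independent standard normal variables.
Z = rng.standard_normal(size=(100_000, 2))
X = Z @ A.T + mu          # row-wise version of X = A Z + mu

sample_mean = X.mean(axis=0)
sample_cov = np.cov(X, rowvar=False)
rank = np.linalg.matrix_rank(sigma)
```

Here the third component always equals the sum of the first two, so the distribution is concentrated on a plane and has no density with respect to three-dimensional volume, yet the sample mean and sample covariance still approach `mu` and `sigma`.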


Linear transformation

If <math>{\mathbf y}={\mathbf B}{\mathbf x}</math> is a linear transformation of a <math>p</math>-dimensional multivariate normal vector <math>{\mathbf x}</math>, where <math>{\mathbf B}</math> is an <math>m\times p</math> matrix of rank <math>m</math> with <math>m\leq p</math>, then <math>{\mathbf y}</math> has a multivariate normal distribution with mean <math>{\mathbf B}{\mathbf\mu}</math> and covariance matrix <math>{\mathbf B}{\mathbf\Sigma}{\mathbf B}^T</math>.

Corollary: any subset of the <math>x_i</math> has a marginal distribution that is also multivariate normal. To see this, consider the following example: to extract the subset <math>(x_1,x_2,x_4)^T</math>, use

<math>
{\mathbf B}= \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \ldots & 0\\ 0 & 1 & 0 & 0 & 0 & \ldots & 0\\ 0 & 0 & 0 & 1 & 0 & \ldots & 0 \end{bmatrix}
</math>

which extracts the desired elements directly.
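The selection-matrix argument can be checked numerically. The sketch below assumes NumPy and uses a hypothetical five-dimensional mean vector and covariance matrix, chosen only so that the example runs.

```python
import numpy as np

p = 5
mu = np.arange(1.0, p + 1.0)             # hypothetical mean vector (1, 2, 3, 4, 5)
M = np.arange(25.0).reshape(p, p)
sigma = M @ M.T + np.eye(p)              # hypothetical symmetric positive definite covariance

idx = [0, 1, 3]                          # zero-based positions of x_1, x_2, x_4
B = np.zeros((len(idx), p))
B[np.arange(len(idx)), idx] = 1.0        # a single 1 per row, in the selected column

mu_y = B @ mu                            # mean of y = B x
sigma_y = B @ sigma @ B.T                # covariance of y = B x
```

Multiplying by B gives the same result as picking out the corresponding entries of the mean vector and the corresponding rows and columns of the covariance matrix, so in practice the marginal parameters can be read off by indexing.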


Marginal distributions

Suppose <math>{\mathbf x}</math> is partitioned into <math>{\mathbf x}_1</math> and <math>{\mathbf x}_2</math>, so that <math>{\mathbf x}=({\mathbf x}_1,{\mathbf x}_2)^T</math> (note that vectors are column vectors by default). Say <math>{\mathbf x}_1</math> has <math>q</math> elements, so <math>{\mathbf x}_2</math> has <math>p-q</math> elements.


Conditional distributions

With <math>{\mathbf x}</math> partitioned as above, if <math>{\mathbf\mu}</math> and <math>{\mathbf\Sigma}</math> are partitioned correspondingly as follows

<math>
{\mathbf\mu}=\begin{bmatrix} {\mathbf\mu}_1\\ {\mathbf\mu}_2 \end{bmatrix} \qquad {\mathbf\Sigma}= \begin{bmatrix} {\mathbf\Sigma}_{11} & {\mathbf\Sigma}_{12} \\ {\mathbf\Sigma}_{21} & {\mathbf\Sigma}_{22} \end{bmatrix}
</math>

then, provided <math>{\mathbf\Sigma}_{22}</math> is nonsingular, the distribution of <math>{\mathbf x}_1</math> conditional on <math>{\mathbf x}_2={\mathbf a}</math> is multivariate normal with mean

<math>
{\mathbf\mu}_1+{\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1}\left({\mathbf a}-{\mathbf\mu}_2\right)
</math>

and covariance matrix

<math>
{\mathbf\Sigma}_{11}- {\mathbf\Sigma}_{12} {\mathbf\Sigma}_{22}^{-1} {\mathbf\Sigma}_{21}.
</math>

This matrix is the Schur complement of <math>{\mathbf\Sigma}_{22}</math> in <math>{\mathbf\Sigma}</math>.

Note that knowing the value of <math>{\mathbf x}_2</math> to be <math>{\mathbf a}</math> alters the variance; perhaps more surprisingly, the mean is shifted by <math>{\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1}\left({\mathbf a}-{\mathbf\mu}_2\right)</math>; compare this with the situation of not knowing the value of <math>{\mathbf x}_2</math>, in which case <math>{\mathbf x}_1</math> would have distribution <math>N_q\left({\mathbf\mu}_1,{\mathbf\Sigma}_{11}\right)</math>.

The matrix <math>{\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1}</math> is known as the matrix of regression coefficients.
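The conditional mean and covariance can be computed directly from the partitioned parameters. The following is a minimal sketch, assuming NumPy and a nonsingular lower-right block; the function name `conditional_mvn` and the numbers in the example are illustrative only.

```python
import numpy as np

def conditional_mvn(mu, sigma, q, a):
    """Mean and covariance of x_1 given x_2 = a, where x_1 is the first q components.

    Assumes the block Sigma_22 is nonsingular.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    a = np.asarray(a, dtype=float)

    mu1, mu2 = mu[:q], mu[q:]
    s11, s12 = sigma[:q, :q], sigma[:q, q:]
    s21, s22 = sigma[q:, :q], sigma[q:, q:]

    # Regression coefficients Sigma_12 Sigma_22^{-1}, obtained from a linear
    # solve on the transposed system instead of an explicit inverse.
    coef = np.linalg.solve(s22.T, s12.T).T

    cond_mean = mu1 + coef @ (a - mu2)
    cond_cov = s11 - coef @ s21          # Schur complement of Sigma_22
    return cond_mean, cond_cov

# Bivariate example: variances 1 and 2, covariance 0.5, observing x_2 = 2.
cond_mean, cond_cov = conditional_mvn(
    mu=[0.0, 0.0],
    sigma=[[1.0, 0.5], [0.5, 2.0]],
    q=1,
    a=[2.0],
)
```

In this example the regression coefficient is 0.5 / 2 = 0.25, so the conditional mean is 0.25 × 2 = 0.5 and the conditional variance is 1 − 0.25 × 0.5 = 0.875, smaller than the unconditional variance of 1.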


Estimation of parameters

The derivation of the maximum-likelihood estimator of the covariance matrix of a multivariate normal distribution is perhaps surprisingly subtle and elegant. See estimation of covariance matrices.
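The derivation is not reproduced here, but its well-known result is easy to compute: the maximum-likelihood estimates are the sample mean and the scatter matrix about the sample mean divided by the number of observations (not by that number minus one). The sketch below assumes NumPy and observations stored one per row; the function name `mvn_mle` and the data are made up for the illustration.

```python
import numpy as np

def mvn_mle(samples):
    """Maximum-likelihood estimates of the mean vector and covariance matrix.

    `samples` is an array with one observation per row.
    """
    X = np.asarray(samples, dtype=float)
    n = X.shape[0]
    mu_hat = X.mean(axis=0)
    centered = X - mu_hat
    # Divide by n (not n - 1): the maximum-likelihood estimator is biased.
    sigma_hat = centered.T @ centered / n
    return mu_hat, sigma_hat

data = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 9.0]])
mu_hat, sigma_hat = mvn_mle(data)
```

With fewer observations than dimensions the estimated covariance matrix is necessarily singular, which is another setting in which singular covariance matrices appear.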






This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License.