Mutual information



         




In probability theory, the mutual information between two random variables X and Y is given by

<math> I(X,Y) = \sum_X \sum_Y P(X,Y) \log \frac{P(X,Y)}{P(X)\,P(Y)} </math>

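where P(X,Y) is the joint probability distribution of X and Y, and P(X) and P(Y) are their marginal distributions. The sums run over all values of X and Y, and terms with P(X,Y) = 0 are taken to contribute zero. The base of the logarithm sets the unit: base 2 gives bits, the natural logarithm gives nats.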
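As an illustration, the definition can be evaluated directly for a small discrete joint distribution. The following is a minimal sketch, not a reference implementation: the function name `mutual_information`, the nested-list layout `joint[i][j] = P(X = x_i, Y = y_j)`, and the example table are all assumptions made for this example, and the natural logarithm is used, so the result is in nats.

```python
import math

def mutual_information(joint):
    """Mutual information in nats for joint[i][j] = P(X = x_i, Y = y_j).

    Assumes the entries are nonnegative and sum to 1.
    """
    # Marginals: row sums give P(X), column sums give P(Y).
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]

    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # terms with P(x, y) = 0 contribute nothing
                total += pxy * math.log(pxy / (px[i] * py[j]))
    return total

# Hypothetical example: two binary variables that agree 80% of the time.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
mi = mutual_information(joint)
print(f"I(X,Y) = {mi:.4f} nats")
```

Dividing the result by `math.log(2)` converts it from nats to bits.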


Properties of mutual information

If X and Y are independent, then I(X,Y) = 0: in that case P(X,Y) = P(X) P(Y), so every logarithm in the sum is log 1 = 0.

Mutual information is symmetric: I(X,Y) = I(Y,X).

Mutual information is nonnegative: I(X,Y) ≥ 0.
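These three properties can be checked numerically on small examples. The sketch below assumes a joint distribution stored as a nested list `joint[i][j] = P(X = x_i, Y = y_j)`; the helper name and both example tables are hypothetical, and the comparisons allow for floating-point rounding rather than testing exact equality.

```python
import math

def mutual_information(joint):
    """Mutual information in nats for joint[i][j] = P(X = x_i, Y = y_j)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * math.log(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

# Independence: build the joint as the product of two marginals.
px = [0.3, 0.7]
py = [0.2, 0.5, 0.3]
independent = [[a * b for b in py] for a in px]
mi_independent = mutual_information(independent)

# Symmetry: swapping the roles of X and Y transposes the table.
dependent = [[0.30, 0.10, 0.10],
             [0.05, 0.15, 0.30]]
transposed = [list(col) for col in zip(*dependent)]
mi_xy = mutual_information(dependent)
mi_yx = mutual_information(transposed)

print("independent:", mi_independent)   # zero up to rounding error
print("I(X,Y), I(Y,X):", mi_xy, mi_yx)  # equal up to rounding error
print("nonnegative:", mi_xy >= 0)
```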


Relation to other quantities

The mutual information can be equivalently expressed as

<math> I(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) </math>

where H(X) and H(X|Y) are the unconditional entropy and the conditional entropy of X, and likewise H(Y) and H(Y|X) are the unconditional and conditional entropy of Y, with

<math> H(X) = -\sum_X P(X)\,\log P(X) </math>

and

<math> H(X|Y) = -\sum_Y P(Y) \sum_X P(X|Y)\,\log P(X|Y) </math>

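Since H(X) ≥ H(X|Y) (conditioning on Y cannot increase the entropy of X), this expression gives the nonnegativity property stated above. Equality holds exactly when X and Y are independent.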
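The agreement between the entropy form and the defining double sum can be checked numerically. The sketch below uses the standard sign convention, in which both entropies carry a leading minus sign and are therefore nonnegative. The function names, the nested-list layout `joint[i][j] = P(X = x_i, Y = y_j)`, and the example table are assumptions made for illustration.

```python
import math

def entropy(p):
    """H = -sum p log p, in nats; zero-probability outcomes are skipped."""
    return -sum(q * math.log(q) for q in p if q > 0)

def conditional_entropy(joint):
    """H(X|Y) = sum over y of P(y) * H(X | Y = y)."""
    py = [sum(col) for col in zip(*joint)]
    h = 0.0
    for j, pyj in enumerate(py):
        if pyj == 0:
            continue  # values of Y that never occur carry no weight
        cond = [row[j] / pyj for row in joint]  # P(X | Y = y_j)
        h += pyj * entropy(cond)
    return h

def mutual_information(joint):
    """Direct evaluation of the defining double sum."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * math.log(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

# Hypothetical example table.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
px = [sum(row) for row in joint]

direct = mutual_information(joint)
via_entropy = entropy(px) - conditional_entropy(joint)
print(direct, via_entropy)  # the two values agree up to rounding error
```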

Mutual information can also be expressed in terms of the Kullback-Leibler divergence. Substituting P(X,Y) = P(Y) P(X|Y) into the definition gives

<math> I(X,Y) = \sum_Y P(Y) \sum_X P(X|Y)\,\log\frac{P(X|Y)}{P(X)} </math>
<math> = \sum_Y P(Y)\;KL(P(X|Y),P(X)) </math>

Thus mutual information can be understood as the average Kullback-Leibler divergence of the conditional distribution P(X|Y) from the marginal distribution P(X), weighted by P(Y): the more P(X|Y) differs from P(X), the greater the information gain.
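The weighted-divergence form can be sketched in code as follows. This is an illustrative example only: `kl_divergence` and `mi_via_kl` are hypothetical helper names, the joint distribution is assumed to be a nested list with `joint[i][j] = P(X = x_i, Y = y_j)`, and the result is in nats.

```python
import math

def kl_divergence(p, q):
    """KL(p, q) = sum p log(p / q), in nats; terms with p = 0 are skipped."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def mi_via_kl(joint):
    """Mutual information as the P(Y)-weighted average of KL(P(X|Y=y), P(X))."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    total = 0.0
    for j, pyj in enumerate(py):
        if pyj == 0:
            continue  # values of Y that never occur carry no weight
        cond = [row[j] / pyj for row in joint]  # P(X | Y = y_j)
        total += pyj * kl_divergence(cond, px)
    return total

# Hypothetical example: knowing Y shifts the distribution of X noticeably.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
mi = mi_via_kl(joint)
print(f"I(X,Y) = {mi:.4f} nats")

# When X and Y are independent, P(X|Y) equals P(X) and every divergence is zero.
independent = [[0.25, 0.25],
               [0.25, 0.25]]
mi_zero = mi_via_kl(independent)
print(mi_zero)
```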






This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License.