Covariance

In probability theory and statistics, covariance is a measure of the joint variability of two random variables. The sign of the covariance shows the tendency in the linear relationship between the variables. Covariance is positive when larger values of one variable tend to occur with larger values of the other (and smaller with smaller), and negative when larger values of one tend to occur with smaller values of the other. The magnitude of the covariance is at most the geometric mean of the variances of the two random variables, and it reaches that bound only when the variables are perfectly linearly related. Because the magnitude also depends on the scales of the variables, a larger covariance does not by itself indicate a stronger dependence.
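As a minimal sketch of how the sign behaves, the snippet below computes the covariance as the mean product of deviations from the means, treating each listed pair as an equally likely outcome. The function name `covariance` and the data are illustrative assumptions, not part of any particular library.

```python
def covariance(xs, ys):
    """Mean product of deviations from the means (divides by n)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Illustrative data: one series rises with x, the other falls.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = [2.0, 4.0, 6.0, 8.0, 10.0]
falling = [10.0, 8.0, 6.0, 4.0, 2.0]

cov_rising = covariance(x, rising)    # positive: values move together
cov_falling = covariance(x, falling)  # negative: values move oppositely
print(cov_rising, cov_falling)
```

The two results have equal magnitude and opposite sign, since `falling` is simply `rising` reversed.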

Covariance has units of measurement (the product of the units of the two variables), and its magnitude is affected by those units. Changing the units of one variable (e.g., from meters to millimeters) rescales the covariance by the same factor, making it difficult to assess the strength of the relationship from the covariance alone. In some situations, it is desirable to compare the strength of the joint association between different pairs of random variables that do not necessarily have the same units. In those situations, the correlation coefficient is used instead: it normalizes the covariance to a dimensionless value between -1 and 1 by dividing by the geometric mean of the variances of the two random variables (i.e., the product of their standard deviations).
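The effect of units can be illustrated with a short sketch. The height and weight figures are made-up values for illustration only, and the helper names `covariance` and `correlation` are assumptions of this example.

```python
import math


def covariance(xs, ys):
    """Mean product of deviations from the means (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    std_x = math.sqrt(covariance(xs, xs))  # cov(X, X) is the variance of X
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)


# Hypothetical measurements, invented for this example.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [52.0, 60.0, 63.0, 75.0, 80.0]
height_mm = [h * 1000 for h in height_m]  # same heights, different unit

cov_m = covariance(height_m, weight_kg)    # units: m * kg
cov_mm = covariance(height_mm, weight_kg)  # units: mm * kg
r_m = correlation(height_m, weight_kg)     # dimensionless
r_mm = correlation(height_mm, weight_kg)   # dimensionless

print(cov_m, cov_mm)
print(r_m, r_mm)
```

Switching from meters to millimeters multiplies the covariance by 1000, while the correlation coefficient is unchanged up to floating-point rounding.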

A distinction is made between (1) the covariance of two random variables, which is a population parameter that can be seen as a property of the joint probability distribution, and (2) the sample covariance, which, in addition to serving as a descriptor of the sample, also serves as an estimate of the population parameter.
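The difference between the two can be sketched in code. The example assumes the common convention in which the sample covariance divides by n - 1 (so that it is an unbiased estimate of the population parameter), whereas describing the data as a complete population divides by n. The function name and its `ddof` parameter are hypothetical choices for this illustration.

```python
def covariance(xs, ys, ddof=0):
    """Sum of products of deviations divided by (n - ddof).

    ddof=0 treats the data as the whole population;
    ddof=1 gives the usual sample covariance.
    """
    n = len(xs)
    if n != len(ys) or n <= ddof:
        raise ValueError("need equal-length inputs with more than ddof points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - ddof)


# Illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

population_cov = covariance(x, y, ddof=0)  # divides by n
sample_cov = covariance(x, y, ddof=1)      # divides by n - 1
print(population_cov, sample_cov)
```

The two values differ by the factor n / (n - 1), which approaches 1 as the sample grows.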