Pandas DataFrame | cov method
Pandas DataFrame.cov(~) method computes the covariance matrix of the columns in the source DataFrame. Note that the unbiased estimator of the covariance is used:

$$\mathrm{cov}(X, Y) = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})$$

Where,
$N$ is the number of values in a column,
$\bar{x}$ is the sample mean of column $X$,
$\bar{y}$ is the sample mean of column $Y$, and
$x_i$ and $y_i$ are the $i$-th values in columns $X$ and $Y$ respectively.
All NaN values are ignored: for each pair of columns, only the rows in which both values are present are used.
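The estimator and the NaN handling described above can be sketched by hand and compared against the built-in method. This is a minimal illustration, not the library's implementation: the helper name pairwise_cov and the demo data are made up for this example.

```python
import numpy as np
import pandas as pd


def pairwise_cov(x: pd.Series, y: pd.Series) -> float:
    """Unbiased sample covariance using only rows where both values are present.

    Hypothetical helper written for illustration; it is not part of pandas.
    """
    valid = x.notna() & y.notna()   # rows where neither value is NaN
    x, y = x[valid], y[valid]
    n = len(x)
    if n < 2:                       # the (n - 1) divisor needs at least 2 rows
        return float("nan")
    return float(((x - x.mean()) * (y - y.mean())).sum() / (n - 1))


# Made-up data with one missing value in column A
demo = pd.DataFrame({"A": [1.0, 2.0, np.nan, 4.0], "B": [2.0, 1.0, 5.0, 7.0]})

manual = pairwise_cov(demo["A"], demo["B"])
builtin = demo.cov().loc["A", "B"]

print(manual, builtin)
```

Both values should agree up to floating-point error, since the row containing the NaN is dropped for the (A, B) pair in each case.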
Parameters
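1. min_periods | int | optional
The minimum number of non-NaN observations required for each pair of columns to compute the covariance. If a pair has fewer observations than this, the corresponding entry is NaN.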
Return Value
A DataFrame that represents the covariance matrix of the values in the source DataFrame.
Examples
Basic usage
Consider the following DataFrame:
df
   A  B
0  2  3
1  4  4
2  6  5
To compute the covariance matrix of the two columns:
df.cov()
     A    B
A  4.0  2.0
B  2.0  1.0
Here, we get the following results:
the sample covariance of columns A and B is 2.0.
the sample variance of column A is 4.0 and that of column B is 1.0.
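As a cross-check, the matrix above can be reproduced in a short script. The sketch below rebuilds the example DataFrame (the construction call is assumed, since only the printed table is shown here) and compares the result with Series.var() and NumPy's np.cov, both of which also divide by N-1 by default.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame from the table shown above
df = pd.DataFrame({"A": [2, 4, 6], "B": [3, 4, 5]})

cov_matrix = df.cov()

# Diagonal entries are the sample variances of each column
var_a = df["A"].var()   # Series.var() uses ddof=1 by default
var_b = df["B"].var()

# np.cov treats rows as variables by default, so pass rowvar=False
numpy_cov = np.cov(df.to_numpy(), rowvar=False)

print(cov_matrix)
print(var_a, var_b)
print(numpy_cov)
```

The off-diagonal entries are equal because the covariance matrix is symmetric.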
Specifying min_periods
Consider the following DataFrame with some missing values:
df
     A    B
0  3.0  5.0
1  NaN  6.0
2  4.0  7.0
Setting min_periods=3 yields:
df.cov(min_periods=3)
    A    B
A NaN  NaN
B NaN  1.0
Here, the reason why we get NaN is that, since the method ignores NaN, column A only has 2 values. Since we've set the minimum threshold to compute the covariance to be 3, every entry involving column A is NaN. Column B has 3 non-NaN values, so its variance (1.0) is still computed.
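The effect of the threshold can be seen by counting the usable rows for each pair of columns and comparing two settings of min_periods. This is a small sketch; the DataFrame construction is assumed from the table above, and the variable names pair_counts, strict and relaxed are chosen for this example.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame with one missing value in column A
df = pd.DataFrame({"A": [3.0, np.nan, 4.0], "B": [5.0, 6.0, 7.0]})

# Number of rows in which both columns of each pair are non-NaN
present = df.notna().astype(int)
pair_counts = present.T @ present

strict = df.cov(min_periods=3)    # pairs with fewer than 3 usable rows become NaN
relaxed = df.cov(min_periods=2)   # pairs with 2 usable rows are now allowed

print(pair_counts)
print(strict)
print(relaxed)
```

With min_periods=2, the pairs involving column A have enough usable rows (2 each), so no entry of the matrix is NaN.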