Correlation Coefficient of Model

correlation_coefficient(actuals, expecteds, precision=4)

Calculates the correlation coefficient of a predictive model by comparing the sum of squared residuals with the sum of squared deviations, indicating how strong or weak the relationship between the model's predictions and the observed values is

Parameters
  • actuals (list of int or float) – List containing the actual values observed from a data set

  • expecteds (list of int or float) – List containing the expected values for a data set based on a predictive model

  • precision (int, default=4) – Maximum number of digits that can appear after the decimal point of the result

Raises
  • TypeError – First and second arguments must be 1-dimensional lists

  • TypeError – Elements of first and second arguments must be integers or floats

  • ValueError – First and second arguments must contain the same number of elements

  • ValueError – Last argument must be a positive integer
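
The checks behind the errors listed above can be sketched as follows. This is a minimal illustration, not the library's own code: the helper name `validate_inputs`, the order of the checks, and the rejection of booleans are all assumptions made for this sketch.

```python
def validate_inputs(actuals, expecteds, precision=4):
    """Hypothetical helper mirroring the documented Raises section."""
    for values in (actuals, expecteds):
        # Both data arguments must be flat (1-dimensional) lists
        if not isinstance(values, list) or any(isinstance(v, list) for v in values):
            raise TypeError("First and second arguments must be 1-dimensional lists")
        for element in values:
            # Assumption: bool is rejected even though it subclasses int
            if isinstance(element, bool) or not isinstance(element, (int, float)):
                raise TypeError(
                    "Elements of first and second arguments must be integers or floats"
                )
    # The two lists are compared element by element, so lengths must match
    if len(actuals) != len(expecteds):
        raise ValueError(
            "First and second arguments must contain the same number of elements"
        )
    # Precision must be a whole number greater than zero
    if isinstance(precision, bool) or not isinstance(precision, int) or precision <= 0:
        raise ValueError("Last argument must be a positive integer")


# Valid input passes silently; mismatched lengths raise ValueError
validate_inputs([2, 3, 5], [1.9, 3.2, 4.8])
try:
    validate_inputs([2, 3, 5], [1.9, 3.2])
except ValueError as error:
    print(error)
```
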

Returns

correlation – Number indicating the statistical strength of the relationship between two variables; the closer to 1, the stronger; the closer to 0, the weaker

Return type

float

Notes

  • Observed values: \(y_i = \{ y_1, y_2, \cdots, y_n \}\)

  • Predicted values: \(\hat{y}_i = \{ \hat{y}_1, \hat{y}_2, \cdots, \hat{y}_n \}\)

  • Mean of all observed values: \(\bar{y} = \frac{1}{n}\cdot{\sum\limits_{i=1}^n y_i}\)

  • Residuals: \(e_i = \{ y_1 - \hat{y}_1, y_2 - \hat{y}_2, \cdots, y_n - \hat{y}_n \}\)

  • Deviations: \(d_i = \{ y_1 - \bar{y}, y_2 - \bar{y}, \cdots, y_n - \bar{y} \}\)

  • Sum of squares of residuals: \(SS_{res} = \sum\limits_{i=1}^n e_i^2\)

  • Sum of squares of deviations: \(SS_{dev} = \sum\limits_{i=1}^n d_i^2\)

  • Correlation coefficient: \(r = \sqrt{1 - \frac{SS_{res}}{SS_{dev}}}\)

  • See also: Coefficient of Determination, \(r^2 = 1 - \frac{SS_{res}}{SS_{dev}}\)
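
The formulas in these notes translate fairly directly into code. The following is a minimal sketch, assuming plain Python lists as input; the name `correlation_sketch` is hypothetical and this is not the library's source. It does no input validation and does not handle the cases where \(SS_{dev} = 0\) or where \(SS_{res} > SS_{dev}\) (a negative value under the square root), since the documentation does not state how those are treated.

```python
from math import sqrt


def correlation_sketch(actuals, expecteds, precision=4):
    """Hypothetical re-implementation of the formulas listed in the Notes."""
    n = len(actuals)
    # Mean of all observed values
    mean = sum(actuals) / n
    # Sum of squares of residuals: observed minus predicted
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actuals, expecteds))
    # Sum of squares of deviations: observed minus mean
    ss_dev = sum((y - mean) ** 2 for y in actuals)
    # Correlation coefficient, rounded to the requested number of digits
    return round(sqrt(1 - ss_res / ss_dev), precision)


print(correlation_sketch([8.2, 9.41, 1.23, 34.7], [7.863, 8.9173, 2.0114, 35.8021]))
```

With the first data set from the Examples section, this sketch prints 0.9983, matching the documented output.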

Examples

Import correlation_coefficient function from regressions library
>>> from regressions.statistics.correlation import correlation_coefficient
Calculate the correlation using the provided actual values [8.2, 9.41, 1.23, 34.7] and the predicted values [7.863, 8.9173, 2.0114, 35.8021]
>>> correlation_short = correlation_coefficient([8.2, 9.41, 1.23, 34.7], [7.863, 8.9173, 2.0114, 35.8021])
>>> print(correlation_short)
0.9983
Calculate the correlation using the provided actual values [2, 3, 5, 7, 11, 13, 17, 19] and the predicted values [1.0245, 3.7157, 6.1398, 8.1199, 12.7518, 14.9621, 15.2912, 25.3182]
>>> correlation_long = correlation_coefficient([2, 3, 5, 7, 11, 13, 17, 19], [1.0245, 3.7157, 6.1398, 8.1199, 12.7518, 14.9621, 15.2912, 25.3182])
>>> print(correlation_long)
0.9011
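
As a check on the first example, the formulas from the Notes can be worked by hand (intermediate values rounded to four decimal places):

  • Mean of observed values: \(\bar{y} = \frac{8.2 + 9.41 + 1.23 + 34.7}{4} = 13.385\)

  • Sum of squares of residuals: \(SS_{res} = 0.337^2 + 0.4927^2 + (-0.7814)^2 + (-1.1021)^2 \approx 2.1815\)

  • Sum of squares of deviations: \(SS_{dev} = (-5.185)^2 + (-3.975)^2 + (-12.155)^2 + 21.315^2 \approx 644.7581\)

  • Correlation coefficient: \(r = \sqrt{1 - \frac{2.1815}{644.7581}} \approx \sqrt{0.9966} \approx 0.9983\)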