Correlation Coefficient of Model
- correlation_coefficient(actuals, expecteds, precision=4)
Calculates the correlation coefficient as a measure of how well a predictive model fits observed data, by comparing the sum of squared residuals to the sum of squared deviations; a result near 1 indicates a strong relationship and a result near 0 a weak one
- Parameters
actuals (list of int or float) – List containing the actual values observed from a data set
expecteds (list of int or float) – List containing the expected values for a data set based on a predictive model
precision (int, default=4) – Maximum number of digits that can appear after the decimal place of the result
- Raises
TypeError – First and second arguments must be 1-dimensional lists
TypeError – Elements of first and second arguments must be integers or floats
ValueError – First and second arguments must contain the same number of elements
ValueError – Last argument must be a positive integer
- Returns
correlation – Number indicating the statistical strength of the relationship between two variables; the closer to 1, the stronger; the closer to 0, the weaker
- Return type
float
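The error conditions listed under Raises can be pictured as a series of checks run before any arithmetic. The following is only a sketch of how such checks might look: the name `validate_inputs` is hypothetical, the messages simply reuse the descriptions above, and the library's actual validation may differ in order and detail (for example, this sketch reports a nested list through the element-type error).

```python
def validate_inputs(actuals, expecteds, precision=4):
    # Hypothetical helper, not part of the regressions library
    for values in (actuals, expecteds):
        # Both data arguments must be lists
        if not isinstance(values, list):
            raise TypeError("First and second arguments must be 1-dimensional lists")
        # Every element must be numeric
        if not all(isinstance(v, (int, float)) for v in values):
            raise TypeError("Elements of first and second arguments must be integers or floats")
    # Each actual value needs a matching expected value
    if len(actuals) != len(expecteds):
        raise ValueError("First and second arguments must contain the same number of elements")
    # Precision must be a positive whole number
    if not isinstance(precision, int) or precision <= 0:
        raise ValueError("Last argument must be a positive integer")


# Valid input passes silently; invalid input raises one of the documented errors
validate_inputs([2, 3, 5], [1.9, 3.2, 4.8], precision=4)
try:
    validate_inputs([2, 3, 5], [1.9, 3.2])
except ValueError as error:
    print(error)
```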
Notes
Observed values: \(y_i = \{ y_1, y_2, \cdots, y_n \}\)
Predicted values: \(\hat{y}_i = \{ \hat{y}_1, \hat{y}_2, \cdots, \hat{y}_n \}\)
Mean of all observed values: \(\bar{y} = \frac{1}{n}\cdot{\sum\limits_{i=1}^n y_i}\)
Residuals: \(e_i = \{ y_1 - \hat{y}_1, y_2 - \hat{y}_2, \cdots, y_n - \hat{y}_n \}\)
Deviations: \(d_i = \{ y_1 - \bar{y}, y_2 - \bar{y}, \cdots, y_n - \bar{y} \}\)
Sum of squares of residuals: \(SS_{res} = \sum\limits_{i=1}^n e_i^2\)
Sum of squares of deviations: \(SS_{dev} = \sum\limits_{i=1}^n d_i^2\)
Correlation coefficient: \(r = \sqrt{1 - \frac{SS_{res}}{SS_{dev}}}\)
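The formulas above can be traced step by step in plain Python. This is a minimal sketch, not the library's implementation: the name `correlation_sketch` is hypothetical, input validation is omitted, and it assumes \(SS_{dev}\) is nonzero and \(SS_{res} \le SS_{dev}\) so that the square root is real.

```python
from math import sqrt


def correlation_sketch(actuals, expecteds, precision=4):
    # Hypothetical helper that mirrors the Notes formulas only
    n = len(actuals)
    # Mean of all observed values
    mean_actual = sum(actuals) / n
    # Residuals: observed minus predicted
    residuals = [y - y_hat for y, y_hat in zip(actuals, expecteds)]
    # Deviations: observed minus mean of observed
    deviations = [y - mean_actual for y in actuals]
    # Sums of squares
    ss_res = sum(e ** 2 for e in residuals)
    ss_dev = sum(d ** 2 for d in deviations)
    # Correlation coefficient, rounded to the requested number of decimals
    r = sqrt(1 - ss_res / ss_dev)
    return round(r, precision)


print(correlation_sketch([8.2, 9.41, 1.23, 34.7], [7.863, 8.9173, 2.0114, 35.8021]))
```

For this data set the sums of squares are roughly \(SS_{res} \approx 2.18\) and \(SS_{dev} \approx 644.76\), so the sketch prints 0.9983.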
Examples
- Import correlation_coefficient function from regressions library
>>> from regressions.statistics.correlation import correlation_coefficient
- Calculate the correlation using the provided actual values [8.2, 9.41, 1.23, 34.7] and the predicted values [7.863, 8.9173, 2.0114, 35.8021]
>>> correlation_short = correlation_coefficient([8.2, 9.41, 1.23, 34.7], [7.863, 8.9173, 2.0114, 35.8021])
>>> print(correlation_short)
0.9983
- Calculate the correlation using the provided actual values [2, 3, 5, 7, 11, 13, 17, 19] and the predicted values [1.0245, 3.7157, 6.1398, 8.1199, 12.7518, 14.9621, 15.2912, 25.3182]
>>> correlation_long = correlation_coefficient([2, 3, 5, 7, 11, 13, 17, 19], [1.0245, 3.7157, 6.1398, 8.1199, 12.7518, 14.9621, 15.2912, 25.3182])
>>> print(correlation_long)
0.9011