Some stat functions in Excel
Overview

I created this workbook is just for fun when enrolled in a bootcamp program. It helped me better understand some of the standard statistical functions used in Excel, notably COVAR, STDEV, CORREL, and RSQL.

In the workbook, I calculated variance, covariance, standard deviation, correlation, and the coefficient of determination (“r-squared”) by using the standard statistical formulas rather than the built-in functions in Excel.

I include this workbook in my portfolio in the event that some other student of statistics finds it useful or perhaps inspires them to do their own base computations. It also demonstrates—admirably in my opinion—useful application of Named Ranges.

Excel file

Click here to download the Excel file.

First worksheet:
Variance and Standard Deviation

The first worksheet features two samples, identified simply as x (A2:A31) and y ( B2:B31). Both are also Named Ranges. There are several formulas which demonstrate how to calculate variance and standard deviation of the samples with and without using Excel’s VAR and STDEV functions.

Second worksheet: Covariance

The second worksheet shows the calculation of covariance between an independent variable (A2:A31) and a dependent variable (B2:B31), appropriately labeled x and y. There are a variety of calculations, all arriving at the same results, with and without Excel’s COVAR functions. The worksheet demonstrates the use of Named Ranges pretty extensively.

Third worksheet:
Correlation

The third worksheet shows the calculation of correlation between an independent variable (A2:A31) and a dependent variable (B2:B31), appropriately labeled x and y. There is one calculation without Excel’s CORREL function and one with.

Fourth worksheet: Charts

In the fourth worksheet, I created several sets of dependent variables based on the single set of independent variables found in cells A2:A31. I then graphed them and compared the values of covariance and correlation among the different sets.

The worksheet includes two charts that show the limitations of using correlation on a polynomial.

Fifth worksheet:
Coefficient of determination

The last worksheet shows how to calculate the coefficient of determination (“r-squared”) where the dependent variable has a polynomial relationship to the independent variable.

It is notable in that Excel’s RSQ function does not properly calculate the coefficient of determination for a polynomial relationship. However, we can still use built-in functionality (rather than calculating r-squared “manually” as in cell I15) by pulling the r-squared value from LINEST.

Excel filesExcel