Here we show how to compute the sample and population standard deviation and variance, as well as the range, in R, including column by column in a dataframe.
Function | Usage |
sd() | Calculate sample standard deviation |
sd()*sqrt((N-1)/N) | Calculate population standard deviation |
var() | Calculate sample variance |
var()*(N-1)/N | Calculate population variance |
range() | Derive data range |
sd() and var() come with the "stats" package and range() comes with the "base" package. Both packages are part of the base version of R, so no installation is needed.
See tests and intervals for statistical tests on sample variances.
Enter the data by hand:
Sample standard deviation:
[1] 5.618422
The sample standard deviation for \(n\) observations here is defined as:
\[\text{Sample Std. Dev.} = s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}\] with \[\bar{x} = \frac{\sum_{i=1}^{n}x_i}{n}.\]
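As a minimal sketch, the definition can be checked against sd() by computing it step by step. The vector x below is an assumption for illustration: it reuses the values from the range example further down, which are not necessarily the data behind the output shown above, so the numbers will differ.

```r
# Illustrative data (an assumption; not necessarily the data used above)
x <- c(53, 58, 47, 55, 44, 48, 49, 65, 47, 40)

n <- length(x)           # number of observations
x_bar <- sum(x) / n      # sample mean

# Sample standard deviation from the definition (divides by n - 1)
s_manual <- sqrt(sum((x - x_bar)^2) / (n - 1))

# Built-in function from the "stats" package
s_builtin <- sd(x)

# TRUE when the two agree up to floating-point tolerance
all.equal(s_manual, s_builtin)
```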
Enter the data by hand:
Population standard deviation:
[1] 5.330103
The population standard deviation for a population of size \(N\) here is defined as:
\[\text{Population Std. Dev.} = \sigma = \sqrt{{\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}}}\] with \[\mu = \frac{\sum_{i=1}^{N}x_i}{N}.\]
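The factor sqrt((N-1)/N) works because sd() divides the sum of squared deviations by N-1, while the population formula divides by N. The sketch below compares the two routes. The vector x is an assumption for illustration (the values from the range example), so it does not reproduce the output shown above.

```r
# Illustrative data (an assumption; not necessarily the data used above)
x <- c(53, 58, 47, 55, 44, 48, 49, 65, 47, 40)

N <- length(x)   # population size
mu <- mean(x)    # population mean

# Population standard deviation from the definition (divides by N)
sigma_manual <- sqrt(sum((x - mu)^2) / N)

# Same quantity obtained by rescaling the sample standard deviation
sigma_rescaled <- sd(x) * sqrt((N - 1) / N)

# TRUE when the two agree up to floating-point tolerance
all.equal(sigma_manual, sigma_rescaled)
```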
Using the faithful data from the "datasets" package.
Sample rows from faithful:
eruptions waiting
1 3.600 79
35 3.833 74
70 4.700 73
83 4.100 70
92 4.333 90
122 4.067 69
137 1.883 51
175 4.167 81
248 4.367 82
272 4.467 74
Calculate the sample standard deviation of the columns in the dataframe:
eruptions waiting
1.141371 13.594974
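One way to obtain column-wise values like these is to pass sd() to sapply(), which applies the function to each column of the dataframe. This is a sketch of one possible approach; the name col_sd is hypothetical.

```r
# faithful ships with the "datasets" package, which is loaded by default
col_sd <- sapply(faithful, FUN = sd)

# Named numeric vector with one entry per column
col_sd
```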
Calculate the population standard deviation of the columns in the dataframe:
# Population standard deviation function:
pop.sd <- function(x) {
  N <- length(x)
  sd(x) * sqrt((N - 1) / N)
}
# Results by columns
sapply(faithful, FUN = pop.sd)
eruptions waiting
1.139271 13.569960
Enter the data by hand:
Sample variance:
[1] 37.78889
The sample variance for \(n\) observations here is defined as:
\[\text{Sample Variance} = s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}\] with \[\bar{x} = \frac{\sum_{i=1}^{n}x_i}{n}.\]
Enter the data by hand:
Population variance:
[1] 34.01
The population variance for a population of size \(N\) here is defined as:
\[\text{Population Variance} = \sigma^2 = {\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}}\] with \[\mu = \frac{\sum_{i=1}^{N}x_i}{N}.\]
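The two variance definitions differ only in the denominator, which is why multiplying var() by (N-1)/N gives the population variance. A minimal sketch follows. The vector x is an assumption for illustration (the values from the range example), so the results differ from the outputs shown above.

```r
# Illustrative data (an assumption; not necessarily the data used above)
x <- c(53, 58, 47, 55, 44, 48, 49, 65, 47, 40)

N <- length(x)
ss <- sum((x - mean(x))^2)   # sum of squared deviations from the mean

var_sample <- ss / (N - 1)   # sample variance (divides by N - 1)
var_pop <- ss / N            # population variance (divides by N)

# Each check returns TRUE up to floating-point tolerance
all.equal(var_sample, var(x))
all.equal(var_pop, var(x) * (N - 1) / N)
all.equal(var(x), sd(x)^2)   # variance is the squared standard deviation
```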
See the faithful data above:
Calculate the sample variances of the columns in the dataframe:
eruptions waiting
1.302728 184.823312
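Column-wise variances like these can be obtained with sapply(), as sketched below. Note that calling var() directly on a dataframe returns the covariance matrix rather than a vector of variances; the column variances are on its diagonal. The names col_var and cov_mat are hypothetical.

```r
# Apply var() to each column of faithful
col_var <- sapply(faithful, FUN = var)
col_var

# var() on the whole dataframe gives the 2 x 2 covariance matrix;
# its diagonal holds the same column variances
cov_mat <- var(faithful)
diag(cov_mat)
```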
Calculate the population variances of the columns in the dataframe:
# Population variance function:
pop.var <- function(x) {
  N <- length(x)
  var(x) * ((N - 1) / N)
}
# Results by columns
sapply(faithful, FUN = pop.var)
eruptions waiting
1.297939 184.143815
For \(n\) data points or a population of size \(n\):
\[\text{Range} = \max(x_1, x_2,...,x_n) - \min(x_1, x_2,...,x_n).\]
Enter the data by hand:
Values = c(53, 58, 47, 55, 44, 48, 49, 65, 47, 40)
# The range function gives the min and max values.
range(Values)
[1] 40 65
[1] 40 65
[1] 25
[1] 25
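Since range() returns the minimum and maximum rather than their difference, the range as defined in the formula needs one more step. The sketch below shows two equivalent ways to get it for the same Values vector; the names r, width_a and width_b are hypothetical.

```r
Values <- c(53, 58, 47, 55, 44, 48, 49, 65, 47, 40)

# range() returns c(min, max)
r <- range(Values)
r                                       # 40 65

# Range as max minus min
width_a <- max(Values) - min(Values)    # 25

# Same result from the difference of the two values returned by range()
width_b <- diff(range(Values))          # 25
```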
See the faithful data above:
Calculate the ranges of the columns in the dataframe:
eruptions waiting
[1,] 3.5 53
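A result in this one-row form can be produced by taking diff() of the matrix that sapply() builds from range(). This is a sketch of one possible approach, not necessarily the code that generated the output above; the names rng and width are hypothetical.

```r
# 2 x 2 matrix: row 1 holds the column minimums, row 2 the maximums
rng <- sapply(faithful, range)
rng

# diff() on a matrix subtracts successive rows, giving a 1 x 2 matrix
diff(rng)

# Alternative that returns a named vector instead of a matrix
width <- sapply(faithful, function(x) diff(range(x)))
width
```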
Copyright © 2020 - 2024. All Rights Reserved by Stats Codes