Here we show how to calculate the sample and population standard deviation, the sample and population variance, and the range in R, including column by column in a dataframe.

Summary of Functions for Measures of Variation in R

Function              Usage
sd()                  Calculate sample standard deviation
sd()*sqrt((N-1)/N)    Calculate population standard deviation
var()                 Calculate sample variance
var()*(N-1)/N         Calculate population variance
range()               Derive data range (minimum and maximum)

All of these functions come with the "stats" package (sd() and var()) or the "base" package (range()), both of which are part of the base version of R, so no installation is needed.

See tests and intervals for statistical tests on sample variances.

1 Calculate Sample Standard Deviation and Population Standard Deviation

Calculate Sample Standard Deviation:

Enter the data by hand:

Values = c(11, 7, 4, 6, 3, 20, 7, 12, 17, 6)

Sample standard deviation:

sd(Values)
[1] 5.618422

The sample standard deviation for \(n\) observations here is defined as:

\[\text{Sample Std. Dev.} = s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}\] with \[\bar{x} = \frac{\sum_{i=1}^{n}x_i}{n}.\]
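As a rough check of the formula above, the sample standard deviation can also be computed step by step and compared with sd(). This is only a sketch; the names x_bar, sq_dev and s_manual are illustrative and not part of R.

```r
# Same data as in the example above
Values = c(11, 7, 4, 6, 3, 20, 7, 12, 17, 6)

n = length(Values)            # number of observations
x_bar = sum(Values) / n       # sample mean
sq_dev = (Values - x_bar)^2   # squared deviations from the mean

# Divide the sum of squared deviations by n - 1, then take the square root
s_manual = sqrt(sum(sq_dev) / (n - 1))
s_manual

# Compare with the built-in function (equal up to floating point error)
all.equal(s_manual, sd(Values))
```

The manual result should match the value returned by sd(Values) above.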

Calculate Population Standard Deviation:

Enter the data by hand:

Values = c(11, 7, 4, 6, 3, 20, 7, 12, 17, 6)

Population standard deviation:

N = length(Values)
sd(Values)*sqrt((N-1)/N)
[1] 5.330103

The population standard deviation for a population of size \(N\) here is defined as:

\[\text{Population Std. Dev.} = \sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}}\] with \[\mu = \frac{\sum_{i=1}^{N}x_i}{N}.\]
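The population standard deviation can also be computed directly from this definition, which shows why rescaling sd() by sqrt((N-1)/N) works. The following is a minimal sketch; sigma_direct and sigma_scaled are illustrative names.

```r
# Same data as in the example above
Values = c(11, 7, 4, 6, 3, 20, 7, 12, 17, 6)

N = length(Values)
mu = mean(Values)   # population mean

# Direct definition: divide the sum of squared deviations by N (not N - 1)
sigma_direct = sqrt(sum((Values - mu)^2) / N)

# Rescaled sd(): sd() divides by N - 1, so multiplying the variance by
# (N - 1) / N, i.e. the standard deviation by sqrt((N - 1) / N),
# turns the divisor into N
sigma_scaled = sd(Values) * sqrt((N - 1) / N)

# Both approaches agree up to floating point error
all.equal(sigma_direct, sigma_scaled)
```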

Dataframe Examples:

These examples use the faithful data from the "datasets" package.

Sample rows from faithful:

faithful
    eruptions waiting
1       3.600      79
35      3.833      74
70      4.700      73
83      4.100      70
92      4.333      90
122     4.067      69
137     1.883      51
175     4.167      81
248     4.367      82
272     4.467      74

Calculate the sample standard deviation of the columns in the dataframe:

sapply(faithful, FUN = sd)
eruptions   waiting 
 1.141371 13.594974 

Calculate the population standard deviation of the columns in the dataframe:

# Population standard deviation function:
pop.sd = function(x){
  N = length(x)
  sd(x)*sqrt((N-1)/N)
}

# Results by columns
sapply(faithful, FUN = pop.sd)
eruptions   waiting 
 1.139271 13.569960 
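Both columns of faithful are numeric, so sapply() can be applied to the whole dataframe. For a dataframe that also contains non-numeric columns, one possible approach is to select the numeric columns first. The sketch below assumes the iris data (also from the "datasets" package), which has four numeric columns and one factor column; numeric_cols is an illustrative name.

```r
# Population standard deviation function, as defined above
pop.sd = function(x){
  N = length(x)
  sd(x)*sqrt((N-1)/N)
}

# Logical vector flagging which columns of iris are numeric
numeric_cols = sapply(iris, is.numeric)

# Apply pop.sd to the numeric columns only
iris.pop.sd = sapply(iris[numeric_cols], FUN = pop.sd)
iris.pop.sd
```

The result is a named vector with one value per numeric column; the Species factor column is left out.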

2 Calculate Sample Variance and Population Variance

Calculate Sample Variance:

Enter the data by hand:

Values = c(40, 33, 22, 33, 41, 25, 28, 34, 27, 30)

Sample variance:

var(Values)
[1] 37.78889

The sample variance for \(n\) observations here is defined as:

\[\text{Sample Variance} = s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}\] with \[\bar{x} = \frac{\sum_{i=1}^{n}x_i}{n}.\]
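As with the standard deviation, the formula can be checked by hand against var(). The sketch below also confirms that the sample variance is the square of the sample standard deviation; s2_manual is an illustrative name.

```r
# Same data as in the example above
Values = c(40, 33, 22, 33, 41, 25, 28, 34, 27, 30)

n = length(Values)
x_bar = mean(Values)   # sample mean

# Sum of squared deviations divided by n - 1
s2_manual = sum((Values - x_bar)^2) / (n - 1)
s2_manual

# Agrees with the built-in function up to floating point error
all.equal(s2_manual, var(Values))

# The sample variance equals the squared sample standard deviation
all.equal(var(Values), sd(Values)^2)
```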

Calculate Population Variance:

Enter the data by hand:

Values = c(40, 33, 22, 33, 41, 25, 28, 34, 27, 30)

Population variance:

N = length(Values)
var(Values)*((N-1)/N)
[1] 34.01

The population variance for a population of size \(N\) here is defined as:

\[\text{Population Variance} = \sigma^2 = {\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}}\] with \[\mu = \frac{\sum_{i=1}^{N}x_i}{N}.\]
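Because the population variance divides by \(N\), it is simply the mean of the squared deviations from \(\mu\). A minimal sketch of this direct calculation, with sigma2_direct as an illustrative name:

```r
# Same data as in the example above
Values = c(40, 33, 22, 33, 41, 25, 28, 34, 27, 30)

# Mean of the squared deviations from the population mean
sigma2_direct = mean((Values - mean(Values))^2)
sigma2_direct

# Agrees with the rescaled sample variance up to floating point error
N = length(Values)
all.equal(sigma2_direct, var(Values)*((N-1)/N))
```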

Dataframe Examples:

See the faithful data above:

Calculate the sample variances of the columns in the dataframe:

sapply(faithful, FUN = var)
 eruptions    waiting 
  1.302728 184.823312 

Calculate the population variances of the columns in the dataframe:

# Population variance function:
pop.var = function(x){
  N = length(x)
  var(x)*((N-1)/N)
}

# Results by columns
sapply(faithful, FUN = pop.var)
 eruptions    waiting 
  1.297939 184.143815 

3 Calculate Range

For \(n\) data points or a population of size \(n\):

\[\text{Range} = \max(x_1, x_2,...,x_n) - \min(x_1, x_2,...,x_n).\]

Enter the data by hand:

Values = c(53, 58, 47, 55, 44, 48, 49, 65, 47, 40)

# The range function gives the min and max values.
range(Values)
[1] 40 65
# Or:
c(min(Values), max(Values))
[1] 40 65
# Calculate the range as the difference between the max and min values.
diff(range(Values))
[1] 25
# Or:
max(Values) - min(Values)
[1] 25

See the faithful data above:

Calculate the ranges of the columns in the dataframe:

diff(sapply(faithful, FUN = range))
     eruptions waiting
[1,]       3.5      53
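The call above returns a one-row matrix, because diff() is applied to the 2-row matrix of minimums and maximums produced by sapply(). If a named vector is preferred, one option is to wrap the calculation in a small helper, in the same style as pop.sd and pop.var. The name range.width below is illustrative and not a built-in R function.

```r
# Width of the range (max - min) for a single numeric vector
range.width = function(x){
  diff(range(x))
}

# Results by columns, returned as a named vector
widths = sapply(faithful, FUN = range.width)
widths
```

The values should match the matrix output above, one per column of faithful.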

Copyright © 2020 - 2024. All Rights Reserved by Stats Codes