Here we show how to calculate the mean, median, and mode in R, including the means, medians, and modes of the rows and columns of a data frame.

Summary of Functions for Measures of Center: Mean, Median & Mode in R

Function          Usage
mean()            Calculate mean
colMeans()        Calculate column means
rowMeans()        Calculate row means
sum()             Calculate sum or total
length()          Calculate number of observations
colSums()         Calculate column sums
rowSums()         Calculate row sums
median()          Derive median
sort()            Sort data observations
unique()          Return unique values
Custom function   Derive mode

All the built-in functions are from the "base" package, except median(), which is from the "stats" package. Both packages ship with the standard R installation, so no additional installation is needed.

See tests and intervals for statistical tests on sample means and sample medians.

1 Calculate Means and Sums of Rows and Columns in R

The sample mean, \(\overline{x}\), is calculated from a sample of \(n\) observations, and the population mean, \(\mu\), from a population of size \(N\):

\[\text{Sample Mean} = \overline{x} = \frac{\sum_{i=1}^{n}x_i}{n}\]

\[\text{Population Mean} = \mu = \frac{\sum_{i=1}^{N}x_i}{N}\]

Enter the data by hand:

Values = c(18, 10, 17, 12, 14, 12, 10, 10, 18, 15)
mean(Values)
[1] 13.6
sum(Values)/length(Values)
[1] 13.6
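
Real data often contain missing values, which the example above does not cover. The sketch below uses a hypothetical vector, ValuesNA (not part of the original data), to show that mean() returns NA when any observation is missing unless na.rm = TRUE is set, and that the manual formula needs the same adjustment.

```r
# ValuesNA is a made-up vector used only for illustration.
ValuesNA = c(18, 10, NA, 12, 14)

# With a missing value present, the default result is NA.
mean(ValuesNA)

# na.rm = TRUE drops the missing value before averaging: (18 + 10 + 12 + 14) / 4.
mean(ValuesNA, na.rm = TRUE)

# The manual formula must also count only the non-missing observations.
sum(ValuesNA, na.rm = TRUE) / sum(!is.na(ValuesNA))
```

Both of the last two calls give 13.5, since the four observed values sum to 54.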

The examples below use the stackloss data from the "datasets" package.

Sample rows from stackloss:

stackloss
   Air.Flow Water.Temp Acid.Conc. stack.loss
1        80         27         89         42
3        75         25         90         37
6        62         23         87         18
9        58         23         87         15
13       58         18         82         11
15       50         18         89          8
17       50         19         72          8
19       50         20         80          9
20       56         20         82         15
21       70         20         91         15

Calculate the means of the columns in a dataframe:

colMeans(stackloss)
  Air.Flow Water.Temp Acid.Conc. stack.loss 
  60.42857   21.09524   86.28571   17.52381 
# Or:
apply(stackloss, MARGIN = 2, FUN = mean)
  Air.Flow Water.Temp Acid.Conc. stack.loss 
  60.42857   21.09524   86.28571   17.52381 

Calculate the means of the rows in a dataframe:

rowMeans(stackloss)
 [1] 59.50 58.00 56.75 50.25 47.25 47.50 49.50 49.75 45.75 42.50 44.75 44.00
[13] 42.25 45.50 41.25 40.25 37.25 39.00 39.75 43.25 49.00
# Or:
apply(stackloss, MARGIN = 1, FUN = mean)
 [1] 59.50 58.00 56.75 50.25 47.25 47.50 49.50 49.75 45.75 42.50 44.75 44.00
[13] 42.25 45.50 41.25 40.25 37.25 39.00 39.75 43.25 49.00

Calculate the sums of the columns in a dataframe:

colSums(stackloss)
  Air.Flow Water.Temp Acid.Conc. stack.loss 
      1269        443       1812        368 
# Or:
apply(stackloss, MARGIN = 2, FUN = sum)
  Air.Flow Water.Temp Acid.Conc. stack.loss 
      1269        443       1812        368 

Calculate the sums of the rows in a dataframe:

rowSums(stackloss)
 [1] 238 232 227 201 189 190 198 199 183 170 179 176 169 182 165 161 149 156 159
[20] 173 196
# Or:
apply(stackloss, MARGIN = 1, FUN = sum)
 [1] 238 232 227 201 189 190 198 199 183 170 179 176 169 182 165 161 149 156 159
[20] 173 196
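
The functions above work directly on stackloss because every column is numeric. As a minimal sketch for data frames that mix numeric and non-numeric columns, the example below uses the iris data from the "datasets" package (an assumption for illustration, as it is not used elsewhere on this page) and keeps only the numeric columns before calculating the column means and sums.

```r
# iris has four numeric columns and one factor column (Species),
# so colMeans(iris) stops with an error that 'x' must be numeric.

# Logical vector marking which columns are numeric.
numeric_cols = sapply(iris, is.numeric)

# Column means and sums for the numeric columns only.
colMeans(iris[numeric_cols])
colSums(iris[numeric_cols])
```

The same selection step can be used before rowMeans() and rowSums().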

2 Derive Median and Medians of Rows and Columns in R

When the number of observations is even, the median is the average of the two middle values of the sorted data. Values has 10 observations, so its median is the average of the 5th and 6th sorted values, 12 and 14, which is 13.

Values
 [1] 18 10 17 12 14 12 10 10 18 15
sort(Values)
 [1] 10 10 10 12 12 14 15 17 18 18
# Derive median.
median(Values)
[1] 13
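
To make the rule for odd and even sample sizes explicit, here is a minimal sketch of the calculation done by hand. The helper name manual_median() is hypothetical and not part of base R; in practice median() should be preferred.

```r
# Hypothetical helper, for illustration only.
manual_median = function(x){
  # sort() drops NA values by default, whereas median() returns NA
  # unless na.rm = TRUE is set.
  s = sort(x)
  n = length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                  # odd n: the single middle value
  } else {
    (s[n / 2] + s[n / 2 + 1]) / 2   # even n: average of the two middle values
  }
}

Values = c(18, 10, 17, 12, 14, 12, 10, 10, 18, 15)
manual_median(Values)   # (12 + 14) / 2 = 13, the same as median(Values)
```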

Calculate the medians of the columns in a dataframe:

See the stackloss data above:

apply(stackloss, MARGIN = 2, FUN = median)
  Air.Flow Water.Temp Acid.Conc. stack.loss 
        58         20         87         15 

Calculate the medians of the rows in a dataframe:

See the stackloss data above:

apply(stackloss, MARGIN = 1, FUN = median)
 [1] 61.0 58.5 56.0 45.0 42.0 42.5 43.0 43.0 40.5 38.0 38.0 37.5 38.0 38.5 34.0
[16] 34.0 34.5 34.5 35.0 38.0 45.0

3 Derive Mode and Modes of Rows and Columns in R

Base R has no built-in function for the statistical mode, so we provide a simple custom function. It uses table() to derive the frequency distribution of the unique values, ordered from smallest to largest, and finds the position(s) of the maximum frequency. It then uses sort() and unique() to return the sorted unique value(s) at those position(s).

The modeval() function for deriving mode:

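modeval = function(x){
  # Frequency of each unique value, ordered from smallest to largest.
  counts = table(x)
  # Position(s) of the maximum frequency, then the matching sorted value(s).
  position = which(counts == max(counts))
  sort(unique(x))[position]
}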

Examples for numeric observations:

Values1 = c(4, 6, -3, -8, -2, 6, 9, -8, 3, 1, 10)
modeval(Values1)
[1] -8  6
Values2 = c(4, 6, -3, -8, -2, 6, 9, -8, 3, 1, 10, 4, 4)
modeval(Values2)
[1] 4

Examples for character observations:

Values3 = c("C", "B", "B", "A", "A", "C", "A", "C")
modeval(Values3)
[1] "A" "C"
Values4 = c("C", "B", "B", "A", "A", "C", "A", "C", "B", "B")
modeval(Values4)
[1] "B"
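
To see why modeval() works, the sketch below walks through its intermediate steps for Values1. The variable names counts and position are used here only for illustration.

```r
Values1 = c(4, 6, -3, -8, -2, 6, 9, -8, 3, 1, 10)

# Step 1: frequency of each unique value, ordered from smallest to largest.
# The values -8 and 6 each occur twice; every other value occurs once.
counts = table(Values1)
counts

# Step 2: positions in the table where the frequency equals the maximum.
# These are positions 1 and 7.
position = which(counts == max(counts))
position

# Step 3: the sorted unique values are in the same order as the table,
# so the same positions pick out the modes, -8 and 6.
sort(unique(Values1))[position]
```

The function relies on table() and sort(unique()) ordering the values in the same way, which is why the positions from one can be used to index the other.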

Derive the modes of the columns in a dataframe:

See the stackloss data and modeval() function above:

apply(stackloss, MARGIN = 2, FUN = modeval)
$Air.Flow
[1] 58

$Water.Temp
[1] 18

$Acid.Conc.
[1] 87

$stack.loss
[1]  8 15
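
The result above is a list rather than a vector because the columns do not all have the same number of modes. As a minimal sketch, the code below stores the result in a variable, col_modes (a name chosen here for illustration), counts the modes per column, and collapses each column's modes into a single string for display.

```r
# The modeval() function from above, repeated so this block runs on its own.
modeval = function(x){
  position = which(table(x)==max(table(x)))
  sort(unique(x))[position]
}

col_modes = apply(stackloss, MARGIN = 2, FUN = modeval)

# apply() returns a list when the results have different lengths.
is.list(col_modes)

# Number of modes per column: stack.loss has two, the others have one.
lengths(col_modes)

# One string per column, with tied modes separated by commas.
sapply(col_modes, paste, collapse = ", ")
```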

Derive the modes of the rows in a dataframe:

See the stackloss data and the modeval() function above:

apply(stackloss, MARGIN = 1, FUN = modeval)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
[1,]   27   27   25   24   18   18   19   20   15    14    14    13    11    12
[2,]   42   37   37   28   22   23   24   24   23    18    18    17    18    19
[3,]   80   80   75   62   62   62   62   62   58    58    58    58    58    58
[4,]   89   88   90   87   87   87   93   93   87    80    89    88    82    93
     [,15] [,16] [,17] [,18] [,19] [,20] [,21]
[1,]     8     7     8     8     9    15    15
[2,]    18    18    19    19    20    20    20
[3,]    50    50    50    50    50    56    70
[4,]    89    86    72    79    80    82    91

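Each column of this output corresponds to one row of stackloss. Since no value is repeated within any row, all four values in a row tie for the maximum frequency, so the function returns all of them, sorted from smallest to largest.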
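
When every value occurs exactly once, one common convention is to report that there is no mode. The variant below is a sketch of that convention and is not part of the original modeval() function; the name modeval_or_na() and the choice to return NA are assumptions made for illustration.

```r
# Hypothetical variant: return NA when no value is repeated.
modeval_or_na = function(x){
  counts = table(x)
  # More than one unique value and none repeated: treat as "no mode".
  if (length(counts) > 1 && max(counts) == 1) return(NA)
  sort(unique(x))[which(counts == max(counts))]
}

modeval_or_na(c(4, 6, -3, 6))   # 6 is repeated, so 6 is returned
modeval_or_na(c(4, 6, -3))      # no value is repeated, so NA is returned

# No row of stackloss has a repeated value, so every row gives NA.
apply(stackloss, MARGIN = 1, FUN = modeval_or_na)
```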

Copyright © 2020 - 2024. All Rights Reserved by Stats Codes