Here, we will introduce the main data structures in R, which include vectors, matrices, arrays, dataframes, and lists.

These objects are among the primary instruments for statistical testing and modeling (such as regression) in R. They also support a range of data and mathematical operations.

Data objects in R have two facets: their composition and their dimension. They have 1, 2, or 3 or more dimensions, and they are composed of either homogeneous (similar) or heterogeneous (differing) data types.

Summary of the Main Data Structures in R

Composition \ Dimension   1                       2                  3 or more
Homogeneous               Vector, Matrix, Array   Matrix, Array      Array
Heterogeneous             Dataframe, List*        Dataframe, List*   List*

* A list can have multiple levels when it is a list of lists, or when it itself contains a list of lists.
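
The two facets above can be checked directly on an object. The following is a minimal sketch using base R functions (`is.atomic()`, `dim()`, `length()`); the object names `v`, `m`, `a`, `d` and `l` are illustrative only and are not used elsewhere in this article.

```r
# Illustrative objects, one per kind of structure
v = c(1, 2, 4)                             # homogeneous, 1-dimensional
m = matrix(1:6, nrow = 2)                  # homogeneous, 2-dimensional
a = array(1:24, dim = c(2, 3, 4))          # homogeneous, 3-dimensional
d = data.frame(x = 1:2, y = c("a", "b"))   # heterogeneous, 2-dimensional
l = list(v, m, list(a, d))                 # heterogeneous, with a nested list

# Composition: atomic objects hold a single data type
is.atomic(v)   # TRUE
is.atomic(m)   # TRUE
is.atomic(l)   # FALSE, a list may mix types

# Dimension: dim() reports the size along each dimension
dim(v)         # NULL, a plain vector carries no dim attribute
dim(m)         # 2 3
dim(a)         # 2 3 4
dim(d)         # 2 2
length(l)      # 3 top-level elements
```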

1 Vectors

A vector contains a single (homogeneous) data type and is 1-dimensional.

For example:

vec1 = c(1, 2, 4, 8, 16)
vec1
[1]  1  2  4  8 16
vec2 = c("a", "b", "c")
vec2
[1] "a" "b" "c"
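
As a brief sketch of what homogeneity means in practice, the example below shows element-wise arithmetic on a numeric vector, and how R coerces mixed inputs to one common type. The name `mixed` is introduced here only for illustration.

```r
vec1 = c(1, 2, 4, 8, 16)

# Arithmetic is applied to every element
vec1 * 2         # 2 4 8 16 32

# Elements are selected by position
vec1[2:3]        # 2 4

# Mixed inputs are coerced to one common type (here, character)
mixed = c(1, "a", TRUE)
mixed            # "1" "a" "TRUE"
class(mixed)     # "character"
```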

2 Matrices

A matrix contains a single (homogeneous) data type and has 2 dimensions; a matrix with a single row or a single column is effectively 1-dimensional. Its rows and columns are numbered and can also be named.

For example:

mat1 = matrix(c(1, 2, 4, 8, 
                16, 32, 64, 128,
                256, 512, 1024, 2048),
              nrow = 3, byrow = TRUE)
mat1
     [,1] [,2] [,3] [,4]
[1,]    1    2    4    8
[2,]   16   32   64  128
[3,]  256  512 1024 2048
mat2 = matrix(c(c("a", "b", "c"), 
                c("x", "y", "z")), 
              ncol = 2, byrow = FALSE)
# Name the rows and columns
rownames(mat2) = c("A1", "A2", "A3")
colnames(mat2) = c("B1", "B2")
mat2
   B1  B2 
A1 "a" "x"
A2 "b" "y"
A3 "c" "z"
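
Matrix elements can be selected by position or by name. The sketch below rebuilds a matrix equivalent to `mat2` so that it runs on its own; passing the names through the `dimnames` argument of `matrix()` is one alternative to setting `rownames()` and `colnames()` afterwards.

```r
mat2 = matrix(c("a", "b", "c", "x", "y", "z"), ncol = 2,
              dimnames = list(c("A1", "A2", "A3"), c("B1", "B2")))

dim(mat2)           # 3 2 (rows, columns)
mat2[2, 1]          # "b", selected by position
mat2["A3", "B2"]    # "z", selected by name
mat2[, "B2"]        # the whole second column: "x" "y" "z"
dim(t(mat2))        # 2 3, t() transposes rows and columns
```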

3 Arrays

An array contains a single (homogeneous) data type and can have 1 or more dimensions. Its dimensions are numbered and can also be named.

Create a 3-dimensional array:

arr1 = array(c("a", "b", "c", "d", "e", "f", 
               "g", "h", "i", "j", "k", "l"),
             dim = c(2, 3, 2))
arr1
, , 1

     [,1] [,2] [,3]
[1,] "a"  "c"  "e" 
[2,] "b"  "d"  "f" 

, , 2

     [,1] [,2] [,3]
[1,] "g"  "i"  "k" 
[2,] "h"  "j"  "l" 
# Name the rows and columns
colnames(arr1) = c("A1", "A2", "A3")
rownames(arr1) = c("B1", "B2")
arr1
, , 1

   A1  A2  A3 
B1 "a" "c" "e"
B2 "b" "d" "f"

, , 2

   A1  A2  A3 
B1 "g" "i" "k"
B2 "h" "j" "l"
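
The third dimension can be named as well. A minimal sketch, assuming the hypothetical labels "C1" and "C2" for the third dimension, uses `dimnames()` to name all three dimensions in one step and then selects a single element by name:

```r
# Same contents as arr1 above: the letters "a" to "l"
arr1 = array(letters[1:12], dim = c(2, 3, 2))

# One character vector of names per dimension
# ("C1" and "C2" are hypothetical labels for the third dimension)
dimnames(arr1) = list(c("B1", "B2"),
                      c("A1", "A2", "A3"),
                      c("C1", "C2"))

arr1["B2", "A3", "C2"]   # "l"
arr1[, , "C1"]           # the first 2 x 3 slice, "a" to "f"
```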

Create a 4-dimensional array:

arr2 = array(c(1, 2, 4, 8,
               16, 32, 64, 128,
               256, 512, 1024, 2048,
               4096, 8192, 16384, 32768),
             dim = c(2, 2, 2, 2))
arr2
, , 1, 1

     [,1] [,2]
[1,]    1    4
[2,]    2    8

, , 2, 1

     [,1] [,2]
[1,]   16   64
[2,]   32  128

, , 1, 2

     [,1] [,2]
[1,]  256 1024
[2,]  512 2048

, , 2, 2

     [,1]  [,2]
[1,] 4096 16384
[2,] 8192 32768
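
Elements of a higher-dimensional array are selected with one index per dimension. The sketch below rebuilds `arr2` from powers of 2 (the same 16 values as above) so that it runs on its own:

```r
# 2^(0:15) gives 1, 2, 4, ..., 32768
arr2 = array(2^(0:15), dim = c(2, 2, 2, 2))

dim(arr2)            # 2 2 2 2
arr2[2, 1, 2, 1]     # 32: row 2, column 1 of the ", , 2, 1" slice
arr2[, , 1, 2]       # the 2 x 2 slice holding 256, 512, 1024, 2048
```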

4 Dataframes

A dataframe can contain homogeneous or heterogeneous data types. It is 2-dimensional, with rows and columns that are numbered or can be named.

dtfrm1 = data.frame(Team = c("A", "B", "B", "C", "D"), 
                   Score = c(9, 8, 8, 10, 7), 
                   Position = c(2, 3, 3, 1, 5))
dtfrm1
  Team Score Position
1    A     9        2
2    B     8        3
3    B     8        3
4    C    10        1
5    D     7        5
# Name the rows
rownames(dtfrm1) = c("R1", "R2", "R3", "R4", "R5")
dtfrm1
   Team Score Position
R1    A     9        2
R2    B     8        3
R3    B     8        3
R4    C    10        1
R5    D     7        5
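
Each column of a dataframe holds a single data type, while different columns may hold different types. The sketch below checks the column types and selects columns and rows. It sets `stringsAsFactors = FALSE` explicitly, an assumption made here so that `Team` is stored as character regardless of the R version; `high_scores` is an illustrative name.

```r
dtfrm1 = data.frame(Team = c("A", "B", "B", "C", "D"),
                    Score = c(9, 8, 8, 10, 7),
                    Position = c(2, 3, 3, 1, 5),
                    stringsAsFactors = FALSE)

# Data type of each column
sapply(dtfrm1, class)    # Team: "character", Score and Position: "numeric"

dim(dtfrm1)              # 5 3 (rows, columns)

# Select a column by name, then summarize it
dtfrm1$Score             # 9 8 8 10 7
mean(dtfrm1$Score)       # 8.4

# Select the rows that satisfy a condition
high_scores = dtfrm1[dtfrm1$Score >= 9, ]
high_scores$Team         # "A" "C"
```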

5 Lists

A list can contain homogeneous or heterogeneous data types, including vectors, matrices, arrays, dataframes, and other lists (or lists of lists). It can have 1 or more levels, and the elements at each level are numbered or can be named.

ls1 = list(vec1, mat1, "array" = arr1, "dataframe" = dtfrm1)
ls1
[[1]]
[1]  1  2  4  8 16

[[2]]
     [,1] [,2] [,3] [,4]
[1,]    1    2    4    8
[2,]   16   32   64  128
[3,]  256  512 1024 2048

$array
, , 1

   A1  A2  A3 
B1 "a" "c" "e"
B2 "b" "d" "f"

, , 2

   A1  A2  A3 
B1 "g" "i" "k"
B2 "h" "j" "l"


$dataframe
   Team Score Position
R1    A     9        2
R2    B     8        3
R3    B     8        3
R4    C    10        1
R5    D     7        5
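
List elements are extracted with `[[ ]]` (by position or name) or with `$` (by name), and the same operators reach into nested levels. The following is a minimal sketch; `ls2`, `nested`, and `inner` are hypothetical names, and the dataframe is a shortened stand-in for `dtfrm1`.

```r
vec1 = c(1, 2, 4, 8, 16)
dtfrm1 = data.frame(Team = c("A", "B"), Score = c(9, 8))

# A list with an unnamed element, a named element, and a nested list
ls2 = list(vec1,
           "dataframe" = dtfrm1,
           "nested" = list("inner" = list(1, "a")))

names(ls2)                         # "" "dataframe" "nested"
ls2[[1]]                           # the vector: 1 2 4 8 16
ls2$dataframe$Score                # 9 8
ls2[["nested"]][["inner"]][[2]]    # "a", two levels down

# Single brackets keep the list wrapper; double brackets extract the element
class(ls2[1])                      # "list"
class(ls2[[1]])                    # "numeric"
```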
