Here, we discuss the normal distribution functions in R: plots, parameter settings, random sampling, density, cumulative distribution, and quantiles.
The normal distribution with parameters \(\tt{mean}=\mu\) and \(\tt{standard\;deviation}=\sigma\) has the probability density function (pdf):
\[f(x)={\frac {1}{\sigma {\sqrt {2\pi }}}}e^{-{\frac {1}{2}}\left({\frac {x-\mu }{\sigma }}\right)^{2}},\] for \(x\in\mathbb{R}\),
where \(\mu \in \mathbb {R}\), and \({\sigma>0}\).
\(e\) is \(\tt{Euler's\;number}\) with \(e \approx 2.71828\), and \(\pi\approx 3.14159\).
The mean is \(\mu\), and the variance is \(\sigma^2\).
The standard normal distribution has \(\mu = 0\), and \(\sigma = 1\).
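As a quick check of the formula above, the sketch below writes the pdf out term by term and compares it with R's built-in dnorm(). The helper name normal_pdf and the test points are illustrative assumptions, not part of R.

```r
# Hypothetical helper: the pdf formula written out term by term.
normal_pdf <- function(x, mu = 0, sigma = 1) {
  1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2)
}

# Arbitrary test points, evaluated with mean = 3 and sd = 1.5.
x <- c(-1, 0, 2.5, 3)
manual  <- normal_pdf(x, mu = 3, sigma = 1.5)
builtin <- dnorm(x, mean = 3, sd = 1.5)

# TRUE when the two agree up to floating-point tolerance.
all.equal(manual, builtin)
```

Writing the formula by hand is only useful as a sanity check; in practice dnorm() is the function to use.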
See also probability distributions, test for normal distribution, and plots and charts.
The table below shows the functions for normal distributions in R.
Function | Usage |
rnorm(n, mean=0, sd=1) | Simulate a random sample with \(n\) observations |
dnorm(x, mean=0, sd=1) | Calculate the probability density at the point \(x\) |
pnorm(q, mean=0, sd=1) | Calculate the cumulative distribution at the point \(q\) |
qnorm(p, mean=0, sd=1) | Calculate the quantile value associated with \(p\) |
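The cumulative distribution and quantile functions in the table are inverses of each other. The minimal sketch below illustrates this for the standard normal distribution; the variable names and the value 1.5 are chosen only for illustration.

```r
# Start from a point q, map it to a probability, then map back.
q <- 1.5
p <- pnorm(q)        # P(X <= 1.5) for the standard normal
q_back <- qnorm(p)   # quantile associated with that probability

# TRUE when the round trip returns the starting point
# (up to floating-point tolerance).
all.equal(q, q_back)
```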
Below is a plot of the normal distribution function with \(\tt{mean}=3\) and \(\tt{standard\;deviation\;}=1.5\).
x = seq(-3, 9, 1/1000); y = dnorm(x, 3, 1.5)
plot(x, y, type = "l",
xlim = c(-3, 9), ylim = c(0, max(y)),
main = "Probability Density Function of Normal Distribution (3, 1.5)",
xlab = "x", ylab = "Density",
lwd = 2, col = "blue")
# Add line and legend
lines(c(3, 3), c(0, max(y)), col = "red")
legend("topleft", "mean = 3, sd = 1.5",
lwd = 2,
col = "blue",
bty = "n")
Below is a plot of multiple normal distribution functions in one graph.
x1 = seq(-7, 7, 1/1000); y1 = dnorm(x1, 0, 1)
x2 = seq(-7, 7, 1/1000); y2 = dnorm(x2, 0, 2)
x3 = seq(-7, 7, 1/1000); y3 = dnorm(x3, 1, 1)
plot(x1, y1, type = "l",
xlim = c(-7, 7), ylim = range(c(y1, y2, y3)),
main = "Probability Density Functions of Normal Distributions",
xlab = "x", ylab = "Density",
lwd = 2, col = "blue")
lines(x2, y2, lwd = 2, col = "red")
lines(x3, y3, lwd = 2, col = "green")
# Add legend
legend("topleft", c("mean = 0, sd = 1",
"mean = 0, sd = 2",
"mean = 1, sd = 1"),
lwd = c(2, 2, 2),
col = c("blue", "red", "green"),
bty = "n")
In the normal distribution functions, the parameters default to \(\mu=0\) and \(\sigma=1\), so they only need to be specified when different values are required.
For example, for pnorm(), the following are the same:
# The order of 0 and 1 matters here as the parameter names are not used.
# The first number 0 is mean, and 1 is standard deviation.
pnorm(2); pnorm(2, 0); pnorm(2, 0, 1)
[1] 0.9772499
[1] 0.9772499
[1] 0.9772499
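When the parameter names are used, the order of the arguments no longer matters. The sketch below is an illustrative assumption about how one might check this; the variable names are not from the original.

```r
# Positional call: the second value is the mean, the third is the sd.
by_position <- pnorm(2, 0, 1)

# Named calls: the arguments can be given in either order.
named_in_order <- pnorm(2, mean = 0, sd = 1)
named_swapped  <- pnorm(2, sd = 1, mean = 0)

# Both comparisons are TRUE: all three calls return the same value.
identical(by_position, named_in_order)
identical(named_in_order, named_swapped)
```

Naming the parameters is a reasonable habit whenever non-default values are used, since it avoids accidentally swapping the mean and the standard deviation.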
Sample 1000 observations from the normal distribution with \(\tt{mean} = 12\) and \(\tt{standard\;deviation\;} = 2\):
set.seed(1234) # Line allows replication (use any number).
obs = rnorm(1000, 12, 2) # Named "obs" to avoid masking the base function sample().
hist(obs,
main = "Histogram of 1000 Observations from Normal Distribution
with Mean = 12 and Standard Deviation = 2",
xlab = "x",
col = "deepskyblue", border = "white")
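To see what set.seed() does in practice, the sketch below draws the same sample twice with the same seed and then compares the sample mean and standard deviation with the parameters. The variable names are illustrative, and the sample statistics will be close to, not exactly equal to, 12 and 2.

```r
# Same seed, same call: the two samples are identical.
set.seed(1234)
draws_a <- rnorm(1000, mean = 12, sd = 2)
set.seed(1234)
draws_b <- rnorm(1000, mean = 12, sd = 2)
identical(draws_a, draws_b)

# Sample statistics should be near the parameters (12 and 2),
# with some sampling error.
mean(draws_a)
sd(draws_a)
```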
Calculate the density at \(x = 2\) in the normal distribution with \(\tt{mean} = 3\) and \(\tt{standard\;deviation\;} = 1.2\):
dnorm(2, 3, 1.2)
[1] 0.2349266
x = seq(-2, 8, 1/1000); y = dnorm(x, 3, 1.2)
plot(x, y, type = "l",
xlim = c(-2, 8), ylim = c(0, max(y)),
main = "Probability Density Function of Normal Distribution
with Mean = 3 and Standard Deviation = 1.2",
xlab = "x", ylab = "Density",
lwd = 2, col = "blue")
# Add lines
segments(2, -1, 2, 0.2349266)
segments(-3, 0.2349266, 2, 0.2349266)
Calculate the cumulative distribution at \(x = 3.2\) in the normal distribution with \(\tt{mean} = 4\) and \(\tt{standard\;deviation\;} = 1.8\). That is, \(P(X \le 3.2)\):
pnorm(3.2, 4, 1.8)
[1] 0.3283606
x = seq(-2, 10, 1/1000); y = pnorm(x, 4, 1.8)
plot(x, y, type = "l",
xlim = c(-2, 10), ylim = c(0,1),
main = "Cumulative Distribution Function of Normal Distribution
with Mean = 4 and Standard Deviation = 1.8",
xlab = "x", ylab = "Cumulative Distribution",
lwd = 2, col = "blue")
# Add lines
segments(3.2, -1, 3.2, 0.3283606)
segments(-3, 0.3283606, 3.2, 0.3283606)
x = seq(-2, 10, 1/1000); y = dnorm(x, 4, 1.8)
plot(x, y, type = "l",
xlim = c(-2, 10), ylim = c(0, max(y)),
main = "Probability Density Function of Normal Distribution
with Mean = 4 and Standard Deviation = 1.8",
xlab = "x", ylab = "Density",
lwd = 2, col = "blue")
# Add shaded region and legend
point = 3.2
polygon(x = c(x[x <= point], point),
y = c(y[x <= point], 0),
col = "limegreen")
legend("topright", c("Area = 0.3283606"),
fill = c("limegreen"),
inset = 0.01)
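The shaded area in the plot above is the integral of the density up to 3.2. As a rough numerical check, the sketch below integrates dnorm() with base R's integrate() and compares the result with pnorm(); the variable names are illustrative, and the agreement is only up to the numerical integration error.

```r
# Area under the density from -Inf to 3.2, by numerical integration.
# The extra arguments mean and sd are passed through to dnorm().
area <- integrate(dnorm, lower = -Inf, upper = 3.2, mean = 4, sd = 1.8)

# The same probability from the cumulative distribution function.
cdf_value <- pnorm(3.2, mean = 4, sd = 1.8)

# The difference should be very small.
abs(area$value - cdf_value)
```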
For the upper tail at \(x = 3.2\), that is, \(P(X \ge 3.2) = 1 - P(X \le 3.2)\), set the "lower.tail" argument to FALSE:
pnorm(3.2, 4, 1.8, lower.tail = FALSE)
[1] 0.6716394
x = seq(-2, 10, 1/1000); y = dnorm(x, 4, 1.8)
plot(x, y, type = "l",
xlim = c(-2, 10), ylim = c(0, max(y)),
main = "Shaded Upper Region: Probability Density Function of
Normal Distribution with Mean = 4 and SD = 1.8",
xlab = "x", ylab = "Density",
lwd = 2, col = "blue")
# Add shaded region and legend
point = 3.2
polygon(x = c(point, x[x >= point]),
y = c(0, y[x >= point]),
col = "limegreen")
legend("topright", c("Area = 0.6716394"),
fill = c("limegreen"),
inset = 0.01)
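The upper tail can also be computed by subtracting the lower tail from 1. The sketch below compares the two approaches; the variable names and the far-tail point 10 are illustrative choices. For points far in the tail, the subtraction loses all precision, which is one reason to prefer the "lower.tail" argument.

```r
# Two routes to P(X >= 3.2) with mean = 4 and sd = 1.8.
upper_direct  <- pnorm(3.2, mean = 4, sd = 1.8, lower.tail = FALSE)
upper_by_diff <- 1 - pnorm(3.2, mean = 4, sd = 1.8)
all.equal(upper_direct, upper_by_diff)

# Far in the tail of the standard normal, the subtraction underflows
# to exactly 0, while lower.tail = FALSE keeps a tiny positive value.
far_direct  <- pnorm(10, lower.tail = FALSE)
far_by_diff <- 1 - pnorm(10)
far_direct > 0
far_by_diff == 0
```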
Derive the quantile for \(p = 0.95\) in the normal distribution with \(\tt{mean} = 2.5\) and \(\tt{standard\;deviation\;} = 0.6\). That is, \(x\) such that \(P(X \le x)=0.95\):
qnorm(0.95, 2.5, 0.6)
[1] 3.486912
x = seq(0.5, 4.5, 1/1000); y = pnorm(x, 2.5, 0.6)
plot(x, y, type = "l",
xlim = c(0.5, 4.5), ylim = c(0,1),
main = "Cumulative Distribution Function of Normal Distribution
with Mean = 2.5 and Standard Deviation = 0.6",
xlab = "x", ylab = "Cumulative Distribution",
lwd = 2, col = "blue")
# Add lines
segments(3.486912, -1, 3.486912, 0.95)
segments(-1, 0.95, 3.486912, 0.95)
x = seq(0.5, 4.5, 1/1000); y = dnorm(x, 2.5, 0.6)
plot(x, y, type = "l",
xlim = c(0.5, 4.5), ylim = c(0, max(y)),
main = "Probability Density Function of Normal Distribution
with Mean = 2.5 and Standard Deviation = 0.6",
xlab = "x", ylab = "Density",
lwd = 2, col = "blue")
# Add shaded region and legend
point = 3.486912
polygon(x = c(x[x <= point], point),
y = c(y[x <= point], 0),
col = "limegreen")
legend("topright", c("Area = 0.95"),
fill = c("limegreen"),
inset = 0.01)
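Quantiles of any normal distribution can also be obtained from the standard normal quantile as \(\mu + \sigma z\). The sketch below checks this for the example above; the variable names are illustrative.

```r
# Standard normal quantile for p = 0.95.
z <- qnorm(0.95)

# Shift and scale it to mean = 2.5 and sd = 0.6.
x_scaled <- 2.5 + 0.6 * z

# The same quantile computed directly.
x_direct <- qnorm(0.95, mean = 2.5, sd = 0.6)

# TRUE when the two agree up to floating-point tolerance.
all.equal(x_scaled, x_direct)
```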
For the upper tail with \(p = 0.05\), that is, \(x\) such that \(P(X \ge x)=0.05\), set the "lower.tail" argument to FALSE:
qnorm(0.05, 2.5, 0.6, lower.tail = FALSE)
[1] 3.486912
x = seq(0.5, 4.5, 1/1000); y = dnorm(x, 2.5, 0.6)
plot(x, y, type = "l",
xlim = c(0.5, 4.5), ylim = c(0, max(y)),
main = "Shaded Upper Region: Probability Density Function of
Normal Distribution with Mean = 2.5 and SD = 0.6",
xlab = "x", ylab = "Density",
lwd = 2, col = "blue")
# Add shaded region and legend
point = 3.486912
polygon(x = c(point, x[x >= point]),
y = c(0, y[x >= point]),
col = "limegreen")
legend("topright", c("Area = 0.05"),
fill = c("limegreen"),
inset = 0.01)
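The upper-tail quantile for \(p = 0.05\) matches the lower-tail quantile for \(p = 0.95\), and the lower-tail quantile for \(p = 0.05\) is its mirror image about the mean. The sketch below checks both relationships; the variable names are illustrative.

```r
# Upper-tail quantile for p = 0.05 and lower-tail quantile for p = 0.95.
upper_q <- qnorm(0.05, mean = 2.5, sd = 0.6, lower.tail = FALSE)
lower_q <- qnorm(0.95, mean = 2.5, sd = 0.6)
all.equal(upper_q, lower_q)

# By symmetry about the mean, the lower-tail 0.05 quantile
# lies the same distance below 2.5.
mirror_q <- qnorm(0.05, mean = 2.5, sd = 0.6)
all.equal(mirror_q, 2 * 2.5 - upper_q)
```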
Copyright © 2020 - 2024. All Rights Reserved by Stats Codes