This page summarizes some of the most common regression methods and topics covered in this section.

1 Simple Linear Regression

The simple linear regression model is:

\(y = \alpha + \beta x + \varepsilon,\)

where \(\varepsilon\) represents the error term. To predict or estimate the true \(y\), as \(\widehat y\), use the equation below:

\(\widehat y = \widehat \alpha + \widehat \beta x.\)

# Sample data
y = c(8.9, 7.7, 9.9, 7.6, 7.1, 6.2, 6.6, 7.4)
x = c(3.7, 3.2, 4.6, 3.3, 3.6, 2.7, 2.8, 3.0)
df_data = data.frame(y, x)
df_data

# Fit the model using the vectors in the workspace
model = lm(y ~ x)
summary(model)

# Equivalent fit using the data frame
model = lm(y ~ x, data = df_data)
summary(model)
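
With the fitted model, predict() evaluates the estimated equation at new values of \(x\); the value used below is illustrative, not from the original data.

# Estimate y-hat at a new x value (illustrative value)
predict(model, newdata = data.frame(x = 4.0))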

2 Multiple Linear Regression

The multiple linear regression model, with \(p\) explanatory variables, is:

\(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon,\)

where \(\varepsilon\) represents the error term. To predict or estimate the true \(y\), as \(\widehat y\), use the equation below:

\(\widehat y = \widehat \beta_0 + \widehat \beta_1 x_1 + \widehat \beta_2 x_2 + \cdots + \widehat \beta_p x_p.\)

# Sample data
y = c(4.7, 2.4, 1.6, 6.2, 3.7, 9.2, 2.8, 2.9)
x1 = c(6.4, 3.9, 5.1, 7.2, 5.2, 5.4, 5.2, 6.2)
x2 = c(6.0, 4.4, 5.9, 4.4, 6.5, 3.0, 6.5, 7.1)
x3 = c(7.1, 6.6, 6.8, 5.7, 7.7, 7.5, 8.6, 6.7)
df_data = data.frame(y, x1, x2, x3)
df_data

# Fit the model using the vectors in the workspace
model = lm(y ~ x1 + x2 + x3)
summary(model)

# Equivalent fit using the data frame
model = lm(y ~ x1 + x2 + x3, data = df_data)
summary(model)

# Equivalent shorthand: "." uses all other columns as predictors
model = lm(y ~ ., data = df_data)
summary(model)
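
As in the simple case, predict() evaluates the fitted equation; the predictor values below are illustrative, not from the original data.

# Estimate y-hat at new predictor values (illustrative values)
predict(model, newdata = data.frame(x1 = 5.0, x2 = 5.5, x3 = 7.0))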

3 Logistic Regression

The logistic regression model, for the probability \(p\) that a binary response equals 1, is:

\(\log \left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_m x_m,\)

where the estimated coefficients are used to predict or estimate the true \(p\), as \(\widehat p\), with the equation below:

\(\widehat p = \frac{\exp\left( \widehat \beta_0 + \widehat \beta_1 x_1 + \widehat \beta_2 x_2 + \cdots + \widehat \beta_m x_m \right)}{1 + \exp\left( \widehat \beta_0 + \widehat \beta_1 x_1 + \widehat \beta_2 x_2 + \cdots + \widehat \beta_m x_m \right)}.\)

# Sample data: y is the binary response
y = c(1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0)
x1 = c(5.10, 4.78, 5.04, 5.87, 5.99, 3.62,
       5.49, 4.58, 5.63, 2.87, 7.52, 5.01)
x2 = c(2.81, 3.31, 2.09, 3.08, 3.47, 2.65,
       3.85, 2.48, 3.61, 2.75, 1.18, 3.29)
lgr_data = data.frame(y, x1, x2)
lgr_data

# Fit the model; the binomial family defaults to the logit link
model = glm(y ~ x1 + x2, family = binomial())
summary(model)

# Equivalent fit with the link stated explicitly, using the data frame
model = glm(y ~ x1 + x2, family = binomial(link = "logit"),
            data = lgr_data)
summary(model)
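
The fitted probabilities \(\widehat p\) come from predict() with type = "response"; the new predictor values below are illustrative, not from the original data.

# Fitted probabilities p-hat for the observed data
predict(model, type = "response")

# p-hat at new predictor values (illustrative values)
predict(model, newdata = data.frame(x1 = 5.0, x2 = 3.0), type = "response")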

4 Zero Intercept Regression

The zero intercept regression (regression through the origin) model is:

\(y = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon,\)

where \(\varepsilon\) represents the error term. To predict or estimate the true \(y\), as \(\widehat y\), use the equation below:

\(\widehat y = \widehat \beta_1 x_1 + \widehat \beta_2 x_2 + \cdots + \widehat \beta_p x_p.\)

# Sample data
x = c(4.5, 3.9, 4.2, 3.1, 2.7, 5.4, 5.3, 4.6)
y = c(8.3, 7.2, 8.3, 7.1, 5.5, 9.8, 8.7, 7.5)
df_data = data.frame(y, x)
df_data

# The -1 in the formula removes the intercept (y ~ 0 + x is equivalent)
model = lm(y ~ -1 + x)
summary(model)

# Equivalent fit using the data frame
model = lm(y ~ -1 + x, data = df_data)
summary(model)
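
As before, predict() applies the fitted zero intercept equation; the x value below is illustrative, not from the original data.

# Estimate y-hat at a new x value (illustrative value)
predict(model, newdata = data.frame(x = 4.0))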

5 Regression with Categorical Independent (Explanatory) Variables

With group 1 as the base group, the model for a categorical explanatory variable with \(p\) groups is:

\(y = \beta_0 + \beta_2 I_{\{grp \; 2\}} + \beta_3 I_{\{grp \; 3\}} + \cdots + \beta_p I_{\{grp \; p\}} + \varepsilon,\)

where \(\varepsilon\) represents the error term, and \(I_{\{ grp \; x\}}=1\) when estimating \(y\) for group \(x\), and \(I_{\{ grp \; x\}}=0\) otherwise. The base group has no indicator of its own, which avoids collinearity with the intercept.

Then \(\widehat \beta_0\) is the estimate for the base group (intercept group), and each alternate group \(x\) is estimated with \(\widehat \beta_0 + \widehat \beta_x.\)

# Sample data: x is the categorical group label
y = c(4.7, 4.8, 4.1, 4.7, 5.4, 5.7, 7.4, 6.8)
x = c("A", "A", "B", "B", "B", "C", "C", "C")
df_data = data.frame(y, x)
df_data

# lm() treats x as a factor; group "A" (the first level) is the base group
model = lm(y ~ x)
summary(model)

# Equivalent fit using the data frame
model = lm(y ~ x, data = df_data)
summary(model)
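
Predicting at each group label returns the estimated group means, that is, \(\widehat \beta_0\) for the base group and \(\widehat \beta_0 + \widehat \beta_x\) for each alternate group.

# Estimated mean of y for each group
predict(model, newdata = data.frame(x = c("A", "B", "C")))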

6 Generalized Linear Model (GLM)

This discusses the generalized linear model (GLM) in R with interpretations, including the binomial, Gaussian, Poisson, and gamma families.

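As a minimal sketch of the glm() interface (the data below is illustrative, not from the linked page), the family argument selects the response distribution and its default link:

# Illustrative count data (not from the linked page)
y = c(2, 5, 1, 0, 3, 7, 4, 2)
x = c(1.2, 2.5, 0.8, 0.3, 1.9, 3.1, 2.2, 1.1)

# Poisson family GLM with its default log link
model = glm(y ~ x, family = poisson())
summary(model)

# Gaussian family with its default identity link; equivalent to lm(y ~ x)
model = glm(y ~ x, family = gaussian())
summary(model)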

7 Binomial Family GLM

This discusses the binomial family GLM in R with interpretations, including the logit, probit, cauchit, log, and cloglog link functions.

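As a minimal sketch (illustrative binary data, not from the linked page), alternative link functions are passed through binomial():

# Illustrative binary data (not from the linked page)
y = c(1, 0, 1, 1, 0, 0, 1, 0)
x = c(5.1, 4.8, 4.2, 5.9, 3.6, 5.0, 3.9, 3.1)

# Binomial family GLM with the probit link
model = glm(y ~ x, family = binomial(link = "probit"))
summary(model)

# Binomial family GLM with the complementary log-log link
model = glm(y ~ x, family = binomial(link = "cloglog"))
summary(model)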

8 Stepwise Regression

This discusses stepwise regression in R, including the forward, backward, and bi-directional (or forward-backward) selection procedures.

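As a minimal sketch using R's built-in step() function and the built-in mtcars data (not the linked page's example):

# Start from the full model and step in both directions by AIC
full_model = lm(mpg ~ ., data = mtcars)
step_model = step(full_model, direction = "both")
summary(step_model)

# Use direction = "backward" for backward-only elimination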

9 Correlation and Covariance

This discusses Pearson's correlation coefficient and covariance, including deriving their values and matrices in R.

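As a minimal sketch, cor() and cov() return Pearson's correlation coefficient and the covariance for a pair of variables, and produce matrices when given a data frame (reusing the data from section 1):

# Data from the simple linear regression example above
y = c(8.9, 7.7, 9.9, 7.6, 7.1, 6.2, 6.6, 7.4)
x = c(3.7, 3.2, 4.6, 3.3, 3.6, 2.7, 2.8, 3.0)

# Pearson's correlation coefficient and covariance of a pair
cor(x, y)
cov(x, y)

# Correlation and covariance matrices of a data frame
df_data = data.frame(y, x)
cor(df_data)
cov(df_data)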
