Here, we discuss the paired samples Wilcoxon signed-rank test in R with interpretations, including test statistics, p-values, and confidence intervals.
The paired samples Wilcoxon signed-rank test in R can be performed with the wilcox.test() function from the base "stats" package.
The paired samples Wilcoxon signed-rank test, under the assumption that the paired differences have a symmetric distribution, can be used to test whether the median of the differences between paired values, from the two populations from which the two dependent samples are drawn, is equal to a certain value (which is stated in the null hypothesis) or not. It is a non-parametric alternative to the paired samples t-test.
In the paired samples Wilcoxon signed-rank test, the test statistic is based on signed ranks. It is the sum of the ranks of the observed paired differences that are greater than the null hypothesis median, where the ranks are based on the distances of the observed paired differences from the null hypothesis median.
Question | Is the median of paired x and y differences equal to \(m_0\)? | Is the median of paired x and y differences greater than \(m_0\)? | Is the median of paired x and y differences less than \(m_0\)? |
Form of Test | Two-tailed test | Right-tailed test | Left-tailed test |
Null Hypothesis, \(H_0\) | \(m_d = m_0\) | \(m_d = m_0\) | \(m_d = m_0\) |
Alternate Hypothesis, \(H_1\) | \(m_d \neq m_0\) | \(m_d > m_0\) | \(m_d < m_0\) |
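As a rough illustration of how the three columns of the table map onto the "alternative" argument, the sketch below runs one call per column. It assumes the same five pairs used in the sample code that follows, with mu playing the role of \(m_0\); the names p_two_sided, p_right, and p_left are chosen here for illustration only.

```r
# Five illustrative pairs (the same values as in the sample code that follows)
data_x = c(4.0, 4.6, 3.9, 5.1, 2.5)
data_y = c(2.4, 3.5, 2.5, 4.7, 6.1)

# One call per column of the table; mu plays the role of m_0
p_two_sided = wilcox.test(data_x, data_y, paired = TRUE, mu = 0,
                          alternative = "two.sided")$p.value
p_right = wilcox.test(data_x, data_y, paired = TRUE, mu = 0,
                      alternative = "greater")$p.value
p_left = wilcox.test(data_x, data_y, paired = TRUE, mu = 0,
                     alternative = "less")$p.value

# Collect the three p-values side by side
c(two_sided = p_two_sided, right = p_right, left = p_left)
```

Only the "alternative" argument changes between the calls; the null hypothesis value stays the same in all three.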
# Create the data samples for the paired samples Wilcoxon signed-rank test
# Values are paired based on matching position in each sample
data_x = c(4.0, 4.6, 3.9, 5.1, 2.5)
data_y = c(2.4, 3.5, 2.5, 4.7, 6.1)
# Run the paired samples Wilcoxon signed-rank test with specifications
wilcox.test(data_x - data_y, alternative = "two.sided",
            mu = 0,
            conf.int = TRUE, conf.level = 0.95)
# or
wilcox.test(data_x, data_y, alternative = "two.sided",
            mu = 0, paired = TRUE,
            conf.int = TRUE, conf.level = 0.95)
Wilcoxon signed rank exact test
data: data_x and data_y
V = 10, p-value = 0.625
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
-3.6 1.6
sample estimates:
(pseudo)median
0.9
Argument | Usage |
x - y | x is the first sample data values, y is the second sample data values |
x, y | Used when the argument "paired" is set to TRUE |
paired | Set to TRUE for paired two sample test |
mu | Population median of paired differences in null hypothesis |
alternative | Set alternate hypothesis as "greater", "less", or the default "two.sided" |
exact | For N<50 and no zeroes and no rank ties: Set to FALSE to compute p-value based on normal distribution (default = NULL, which gives the exact p-value in this case) |
correct | For cases with non-exact p-values: Set to FALSE to remove continuity correction (default = TRUE) |
conf.int | Set to TRUE to include the confidence interval (default = FALSE) |
conf.level | Level of confidence for the confidence interval (default = 0.95) |
# Create data
data_x = rnorm(40); data_y = rnorm(40)
# Create object
wsrt_object = wilcox.test(data_x, data_y, alternative = "two.sided",
                          mu = 0, paired = TRUE,
                          conf.int = TRUE, conf.level = 0.95)
# Extract a component
wsrt_object$statistic
V
430
Test Component | Usage |
wsrt_object$statistic | Test-statistic value |
wsrt_object$p.value | P-value |
wsrt_object$estimate | Point estimate of median of paired differences when conf.int = TRUE |
wsrt_object$conf.int | Confidence interval when conf.int = TRUE |
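For a reproducible view of the components in the table, the following sketch extracts each one from a small fixed dataset instead of random draws. The five pairs are the ones from the sample code earlier on this page; nothing else is assumed beyond the wilcox.test() arguments already described.

```r
# Fixed data, so the extracted values do not change between runs
data_x = c(4.0, 4.6, 3.9, 5.1, 2.5)
data_y = c(2.4, 3.5, 2.5, 4.7, 6.1)

wsrt_object = wilcox.test(data_x, data_y, paired = TRUE,
                          conf.int = TRUE, conf.level = 0.95)

wsrt_object$statistic  # test statistic V
wsrt_object$p.value    # p-value
wsrt_object$estimate   # (pseudo)median, only present when conf.int = TRUE
wsrt_object$conf.int   # confidence interval, only present when conf.int = TRUE

# names() lists every component stored in the object
names(wsrt_object)
```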
With \(\text{rank}(2, 4, 4, 6, 8, 8, 8) = (1, 2.5, 2.5, 4, 6, 6, 6)\).
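The tied-rank example above can be checked directly with the base rank() function, whose default tie handling assigns the average rank to tied values. The variable names here are illustrative.

```r
# Tied values share the average of the ranks they would otherwise occupy
values = c(2, 4, 4, 6, 8, 8, 8)
ranks = rank(values)  # ties.method = "average" is the default
ranks
```

The two 4s share rank (2 + 3)/2 = 2.5 and the three 8s share rank (5 + 6 + 7)/3 = 6.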
Let \(x_i\) and \(y_i\) be the paired sample values,
\(m_0\) be the population median of the paired differences to be tested, as set in the null hypothesis,
\(R_i\) be the rank of \(|(x_i - y_i) - m_0|\) among all such non-zero absolute differences (distances), since wilcox.test() drops zero differences before ranking,
\(I_{([x_i-y_i]-m_0)>0}\) be \(1\) when \(([x_i-y_i]-m_0)>0\) and \(0\) otherwise, and
\(N\) be the number of sample pairs.
The paired samples Wilcoxon signed-rank test has a test statistic, \(V\), of the form:
\[V = \sum_{i=1}^N R_i \cdot I_{([x_i-y_i]-m_0)>0}.\]
See also the Wilcoxon signed rank test for one sample and the sign test for paired samples.
For unpaired (or independent) samples, see the Wilcoxon rank sum test.
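The formula for \(V\) can be traced step by step in a few lines. This is a minimal sketch using the five pairs from the sample code earlier on this page; the names m0, diffs, R, and indicator are chosen here to mirror the symbols in the formula.

```r
data_x = c(4.0, 4.6, 3.9, 5.1, 2.5)
data_y = c(2.4, 3.5, 2.5, 4.7, 6.1)
m0 = 0  # null hypothesis median of the paired differences

# Differences from the null median; zero differences are dropped before ranking
diffs = data_x - data_y - m0
diffs = diffs[diffs != 0]

R = rank(abs(diffs))               # ranks of the distances from m0
indicator = as.numeric(diffs > 0)  # 1 for differences above m0, 0 otherwise
V = sum(R * indicator)
V
```

Here four of the five differences are positive, with ranks 4, 2, 3, and 1, so V = 10, which agrees with the wilcox.test() output shown earlier.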
For large samples (\(N\geq50\)), or cases with rank ties or at least one \(([x_i-y_i] - m_0) = 0\):
With \(n\) as the number of non-zero \(([x_i-y_i] - m_0)\), \(T\) as the number of distinct rank values, and \(t_k\) as the number of observations tied at the \(k\)-th distinct rank value, inference on \(V\) and the test outcome are based on a normal approximation obtained by standardizing \(V\).
With \(\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}=0\) if there are no ties (all \(t_k = 1\)),
\[z = \frac{V - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]
Applying continuity correction for the normal distribution approximation (the default in R),
\[z = \frac{(V + c) - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]
For a two-sided test, \(c=0.5\) if \(V<\frac{n(n+1)}{4}\), \(c=-0.5\) if \(V>\frac{n(n+1)}{4}\), and \(c=0\) if \(V=\frac{n(n+1)}{4}\). For a one-sided test, \(c=-0.5\) if the alternative is "greater", and \(c=0.5\) if it is "less".
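The standardization and the continuity correction can be gathered into one helper. The function signed_rank_z() below is a hypothetical name introduced for this sketch, not part of the "stats" package; it takes the differences already centered at \(m_0\) and follows the formulas above.

```r
# Hypothetical helper: z-value for the signed-rank statistic.
# diffs are the paired differences minus the null median m_0.
signed_rank_z = function(diffs, alternative = "two.sided", correct = TRUE) {
  diffs = diffs[diffs != 0]      # n counts only non-zero differences
  n = length(diffs)
  r = rank(abs(diffs))
  V = sum(r[diffs > 0])
  tie_counts = table(r)          # t_k for each distinct rank value
  center = n * (n + 1) / 4
  cc = 0                         # continuity correction c
  if (correct) {
    cc = switch(alternative,
                "two.sided" = -sign(V - center) * 0.5,
                "greater" = -0.5,
                "less" = 0.5)
  }
  denom = sqrt(n * (n + 1) * (2 * n + 1) / 24 -
                 sum(tie_counts^3 - tie_counts) / 48)
  ((V + cc) - center) / denom
}

# Usage with five illustrative differences (no ties, no zeroes)
d = c(1.6, 1.1, 1.4, 0.4, -3.6)
z = signed_rank_z(d)
p_normal = 2 * pnorm(-abs(z))  # two-sided p-value from the normal approximation
p_normal
```

With exact = FALSE, wilcox.test(d) should report the same p-value, since it then uses the normal approximation even for a small sample.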
For small samples (\(N<50\)) with no rank ties and no \(([x_i-y_i] - m_0)=0\):
The p-value is based on the exact distribution of the Wilcoxon signed rank statistic \(V\), with \(\text{size} = N\).
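The exact distribution is available in R through psignrank(). As a sketch, the two-sided exact p-value for the five-pair sample shown earlier (V = 10, N = 5) can be obtained by doubling the tail probability on the side of the center where V falls, capped at 1; the names center, tail_prob, and p_exact are illustrative.

```r
V = 10  # test statistic from the five-pair sample
N = 5   # number of pairs (no zeroes, no ties)
center = N * (N + 1) / 4

# Tail probability on the side of the center where V lies
tail_prob = if (V > center) {
  psignrank(V - 1, N, lower.tail = FALSE)  # P(statistic >= V)
} else {
  psignrank(V, N)                          # P(statistic <= V)
}

p_exact = min(2 * tail_prob, 1)
p_exact
```

This gives 0.625, the p-value reported in the sample output earlier.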
Enter the data by hand.
data_x = c(0.11, -0.64, -0.85, -1.02, 0.12, -0.95,
-0.49, -0.26, 1.84, -0.65, 0.24, 0.08,
-0.96, 0.57, 1.44, 0.45, 0.04, -0.42)
data_y = c(0.44, -1.65, -0.97, -1.30, 0.68, -1.32,
0.49, -0.63, 2.89, -1.70, -1.02, 3.32,
-1.38, -0.93, -2.08, -0.03, 0.56, 0.15)
Test the following null hypothesis \(H_0\) against the alternative hypothesis \(H_1\) at the level of significance \(\alpha=0.05\).
\(H_0:\) the population median of the paired differences is equal to 0 (\(m_d = 0\)).
\(H_1:\) the population median of the paired differences is not equal to 0 (\(m_d \neq 0\), hence the default two-sided).
Because the level of significance is \(\alpha=0.05\), the level of confidence is \(1 - \alpha = 0.95\).
The wilcox.test() function has the default alternative as "two.sided", the default median of paired differences as 0, and the default level of confidence as 0.95; hence, you do not need to specify the "alternative", "mu", and "conf.level" arguments in this case.
wilcox.test(data_x, data_y, paired = TRUE,
            alternative = "two.sided",
            mu = 0,
            conf.int = TRUE, conf.level = 0.95)
Or, relying on the defaults:
wilcox.test(data_x, data_y, paired = TRUE, conf.int = TRUE)
Wilcoxon signed rank exact test
data: data_x and data_y
V = 99, p-value = 0.5798
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
-0.350 0.735
sample estimates:
(pseudo)median
0.225
The estimate of the median of paired differences, \(\tilde d\), is 0.225,
test statistic, \(V\), is 99,
the p-value, \(p\), is 0.5798,
the 95% confidence interval is [-0.350, 0.735].
Note that for wilcox.test() in R, the two methods (p-value and confidence interval) may disagree in some edge cases, because the p-value is based on either the exact distribution or the normal approximation, and the confidence interval is sometimes based on approximations.
P-value: With the p-value (\(p = 0.5798\)) being greater than the level of significance 0.05, we fail to reject the null hypothesis that the population median of the paired differences is equal to 0.
Confidence Interval: With the null hypothesis median of paired differences (\(m_d = 0\)) being inside the confidence interval, \([-0.350, 0.735]\), we fail to reject the null hypothesis that the population median of the paired differences is equal to 0.
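The two decision rules can also be written as logical checks on the test object. This is a sketch under the same setup as the example (\(\alpha = 0.05\), \(m_0 = 0\)); the names alpha, m0, reject_by_p, and reject_by_ci are introduced here for illustration.

```r
data_x = c(0.11, -0.64, -0.85, -1.02, 0.12, -0.95,
           -0.49, -0.26, 1.84, -0.65, 0.24, 0.08,
           -0.96, 0.57, 1.44, 0.45, 0.04, -0.42)
data_y = c(0.44, -1.65, -0.97, -1.30, 0.68, -1.32,
           0.49, -0.63, 2.89, -1.70, -1.02, 3.32,
           -1.38, -0.93, -2.08, -0.03, 0.56, 0.15)
alpha = 0.05
m0 = 0

wsrt_object = wilcox.test(data_x, data_y, paired = TRUE, mu = m0,
                          conf.int = TRUE, conf.level = 1 - alpha)

# Rule 1: reject when the p-value is below the level of significance
reject_by_p = wsrt_object$p.value < alpha

# Rule 2: reject when m0 falls outside the confidence interval
ci = wsrt_object$conf.int
reject_by_ci = m0 < ci[1] || m0 > ci[2]

c(by_p_value = reject_by_p, by_conf_int = reject_by_ci)
```

Both checks return FALSE here, matching the "fail to reject" conclusions above.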
Using the January to June columns of the nottem data from the "datasets" package (20 rows, 1920 to 1939), with 10 sample rows shown below:
nottem_data = data.frame(matrix(nottem, ncol = 12, byrow = TRUE))[,1:6]
colnames(nottem_data) = c("Jan", "Feb", "Mar", "Apr", "May", "June")
rownames(nottem_data) = 1920:1939
nottem_data
Jan Feb Mar Apr May June
1920 40.6 40.8 44.4 46.7 54.1 58.5
1924 39.3 37.5 38.3 45.5 53.2 57.7
1925 40.0 40.5 40.8 45.1 53.8 59.4
1926 39.2 43.4 43.4 48.9 50.6 56.8
1929 34.8 31.3 41.0 43.9 53.1 56.9
1932 42.4 38.4 40.3 44.6 50.9 57.0
1935 40.0 42.6 43.5 47.1 50.0 60.5
1936 37.3 35.0 44.0 43.9 52.7 58.6
1937 40.8 41.0 38.4 47.4 54.1 58.6
1939 39.4 40.9 42.4 47.8 52.4 58.0
For “Apr” as the x group versus “Jan” as the y group.
Test the following null hypothesis \(H_0\) against the alternative hypothesis \(H_1\) at the level of significance \(\alpha=0.1\), without continuity correction.
\(H_0:\) the population median of the paired differences is equal to 5 (\(m_d = 5\)).
\(H_1:\) the population median of the paired differences is not equal to 5 (\(m_d \neq 5\), hence the default two-sided).
Because the level of significance is \(\alpha=0.1\), the level of confidence is \(1 - \alpha = 0.9\).
wilcox.test(nottem_data$Apr, nottem_data$Jan,
            paired = TRUE,
            alternative = "two.sided", mu = 5,
            correct = FALSE,
            conf.int = TRUE, conf.level = 0.9)
Warning in wilcox.test.default(nottem_data$Apr, nottem_data$Jan, paired = TRUE,
: cannot compute exact p-value with ties
Warning in wilcox.test.default(nottem_data$Apr, nottem_data$Jan, paired = TRUE,
: cannot compute exact confidence interval with ties
Wilcoxon signed rank test
data: nottem_data$Apr and nottem_data$Jan
V = 171, p-value = 0.01373
alternative hypothesis: true location shift is not equal to 5
90 percent confidence interval:
5.599914 7.499985
sample estimates:
(pseudo)median
6.537784
The warnings appear because there are ties in the data. Hence, the p-value is based on the normal approximation, not the exact distribution.
P-value: With the p-value (\(p = 0.01373\)) being less than the level of significance 0.1, we reject the null hypothesis that the population median of the paired differences is equal to 5.
Confidence Interval: With the null hypothesis median of paired differences (\(m_d = 5\)) being outside the confidence interval, \([5.599914, 7.499985]\), we reject the null hypothesis that the population median of the paired differences is equal to 5.
Using the July to December columns of the nottem data from the "datasets" package (20 rows, 1920 to 1939), with 10 sample rows shown below:
nottem_data = data.frame(matrix(nottem, ncol = 12, byrow = TRUE))[,7:12]
colnames(nottem_data) = c("Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
rownames(nottem_data) = 1920:1939
nottem_data
Jul Aug Sep Oct Nov Dec
1920 57.7 56.4 54.3 50.5 42.9 39.8
1924 60.8 58.2 56.4 49.8 44.4 43.6
1925 63.5 61.0 53.0 50.0 38.1 36.3
1926 62.5 62.0 57.5 46.7 41.6 39.8
1929 62.5 60.3 59.8 49.2 42.9 41.9
1932 62.1 63.5 56.3 47.3 43.6 41.8
1935 64.6 64.0 56.8 48.6 44.2 36.4
1936 60.0 61.1 58.1 49.6 41.6 41.3
1937 61.4 61.8 56.3 50.9 41.4 37.1
1939 60.7 61.8 58.2 46.7 46.6 37.8
For “Jul” as the x group versus “Oct” as the y group.
Test the following null hypothesis \(H_0\) against the alternative hypothesis \(H_1\) at the level of significance \(\alpha=0.1\).
\(H_0:\) the population median of the paired differences is equal to 10 (\(m_d = 10\)).
\(H_1:\) the population median of the paired differences is greater than 10 (\(m_d > 10\), hence one-sided).
Because the level of significance is \(\alpha=0.1\), the level of confidence is \(1 - \alpha = 0.9\).
wilcox.test(nottem_data$Jul, nottem_data$Oct,
            paired = TRUE,
            alternative = "greater", mu = 10,
            conf.int = TRUE, conf.level = 0.9)
Warning in wilcox.test.default(nottem_data$Jul, nottem_data$Oct, paired = TRUE,
: cannot compute exact p-value with ties
Warning in wilcox.test.default(nottem_data$Jul, nottem_data$Oct, paired = TRUE,
: cannot compute exact confidence interval with ties
Wilcoxon signed rank test with continuity correction
data: nottem_data$Jul and nottem_data$Oct
V = 186, p-value = 0.001324
alternative hypothesis: true location shift is greater than 10
90 percent confidence interval:
11.60003 Inf
sample estimates:
(pseudo)median
12.5
P-value: With the p-value (\(p = 0.001324\)) being less than the level of significance 0.1, we reject the null hypothesis that the population median of the paired differences is equal to 10.
Confidence Interval: With the null hypothesis median of paired differences (\(m_d = 10\)) being outside the confidence interval, \([11.60003, \infty)\), we reject the null hypothesis that the population median of the paired differences is equal to 10.
For “Aug” as the x group versus “Sep” as the y group.
Test the following null hypothesis \(H_0\) against the alternative hypothesis \(H_1\) at the level of significance \(\alpha=0.05\).
\(H_0:\) the population median of the paired differences is equal to 4.5 (\(m_d = 4.5\)).
\(H_1:\) the population median of the paired differences is less than 4.5 (\(m_d < 4.5\), hence one-sided).
Because the level of significance is \(\alpha=0.05\), the level of confidence is \(1 - \alpha = 0.95\).
wilcox.test(nottem_data$Aug, nottem_data$Sep,
            paired = TRUE,
            alternative = "less", mu = 4.5,
            conf.int = TRUE, conf.level = 0.95)
Warning in wilcox.test.default(nottem_data$Aug, nottem_data$Sep, paired = TRUE,
: cannot compute exact p-value with ties
Warning in wilcox.test.default(nottem_data$Aug, nottem_data$Sep, paired = TRUE,
: cannot compute exact confidence interval with ties
Warning in wilcox.test.default(nottem_data$Aug, nottem_data$Sep, paired = TRUE,
: cannot compute exact p-value with zeroes
Warning in wilcox.test.default(nottem_data$Aug, nottem_data$Sep, paired = TRUE,
: cannot compute exact confidence interval with zeroes
Wilcoxon signed rank test with continuity correction
data: nottem_data$Aug and nottem_data$Sep
V = 76, p-value = 0.2283
alternative hypothesis: true location shift is less than 4.5
95 percent confidence interval:
-Inf 5.049966
sample estimates:
(pseudo)median
4.04234
The warnings appear because there are ties in the data, and \([x_i-y_i] - m_0 = 0\) for at least one observation. Hence, the p-value is based on the normal approximation, not the exact distribution.
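Whether a sample will trigger these warnings can be checked before running the test. The sketch below looks for zero differences and rank ties in the "Aug" versus "Sep" comparison with \(m_0 = 4.5\); it rebuilds the two columns from nottem so that it runs on its own, and the names has_zeroes and has_ties are illustrative.

```r
# Rebuild the monthly matrix: one row per year, one column per month
m = matrix(nottem, ncol = 12, byrow = TRUE)
data_x = m[, 8]  # Aug
data_y = m[, 9]  # Sep

diffs = data_x - data_y - 4.5

# Zeroes: any difference exactly equal to the null median
has_zeroes = any(diffs == 0)

# Ties: repeated ranks among the non-zero absolute differences
nonzero = diffs[diffs != 0]
r = rank(abs(nonzero))
has_ties = length(r) != length(unique(r))

c(zeroes = has_zeroes, ties = has_ties)
```

If either check is TRUE, the exact p-value is not available and the normal approximation is used instead.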
P-value: With the p-value (\(p = 0.2283\)) being greater than the level of significance 0.05, we fail to reject the null hypothesis that the population median of the paired differences is equal to 4.5.
Confidence Interval: With the null hypothesis median of paired differences (\(m_d = 4.5\)) being inside the confidence interval, \((-\infty, 5.049966]\), we fail to reject the null hypothesis that the population median of the paired differences is equal to 4.5.
Here, for a paired samples Wilcoxon signed-rank test, we show how to get the test statistic (and z-value) and the p-value from the wilcox.test() function in R, or by written code.
data_x = nottem_data$Nov; data_y = nottem_data$Dec
wsrt_object = wilcox.test(data_x, data_y, paired = TRUE,
                          correct = TRUE,
                          alternative = "two.sided", mu = 5)
wsrt_object
Warning in wilcox.test.default(data_x, data_y, paired = TRUE, correct = TRUE, :
cannot compute exact p-value with ties
Wilcoxon signed rank test with continuity correction
data: data_x and data_y
V = 47.5, p-value = 0.03326
alternative hypothesis: true location shift is not equal to 5
\[V = \sum_{i=1}^N R_i \cdot I_{([x_i-y_i]-m_0)>0}.\]
With continuity correction:
\[z = \frac{(V + c) - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]
For two-sided test, \(c=0.5\) if \(V<\frac{n(n+1)}{4}\), \(c=-0.5\) if \(V>\frac{n(n+1)}{4}\), and \(c=0\) if \(V=\frac{n(n+1)}{4}\). For one-sided test, \(c=-0.5\) if the alternative is "greater", and \(c=0.5\) if it is "less".
wsrt_object$statistic
V
47.5
[1] 47.5
Same as:
[1] 47.5
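The value V = 47.5 shown above can be reproduced along the following lines. This is a sketch: the names diffs, r, and V are assumed here so that they match the z-value code that follows, and the "Nov" and "Dec" columns are rebuilt from nottem so the block runs on its own.

```r
# Rebuild the Nov and Dec columns (all 20 years, 1920 to 1939)
m = matrix(nottem, ncol = 12, byrow = TRUE)
data_x = m[, 11]  # Nov
data_y = m[, 12]  # Dec

# Differences from the null median m_0 = 5, dropping any zeroes
diffs = data_x - data_y - 5
diffs = diffs[diffs != 0]

# Ranks of the absolute differences (ties get average ranks)
r = rank(abs(diffs))

# Sum of the ranks that belong to positive differences
V = sum(r[diffs > 0])
V

# Same as: weight every rank by the indicator of a positive difference
sum(r * (diffs > 0))
```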
For z-value:
c = 0.5 # Given two-sided and V < n*(n + 1)/4 (47.5<105)
t = table(r)
n = length(diffs[diffs!=0])
num = (V + c) - n*(n + 1)/4
denom = sqrt(n*(n+1)*(2*n+1)/24 - sum(t^3 - t)/48)
z = num/denom
z
[1] -2.12889
Two-tailed: For positive z-value (\(z^+\)), and negative z-value (\(z^-\)).
\(Pvalue = 2*P(Z>z^+)\) or \(Pvalue = 2*P(Z<z^-)\).
One-tailed: For right-tail, \(Pvalue = P(Z>z)\) or for left-tail, \(Pvalue = P(Z<z)\).
[1] 0.0332634
Same as:
Note that the p-value depends on the test statistic (\(z = -2.12889\)). We also use the distribution function pnorm() for the normal distribution in R.
[1] 0.03326336
[1] 0.03326336
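A minimal sketch of the two-tailed calculation with pnorm(), taking the rounded z-value reported above as given. Because z is rounded to five decimals, the result may differ from the outputs above in the last digits; the names p_lower and p_upper are illustrative.

```r
z = -2.12889  # z-value from the calculation above (rounded)

# z is negative, so double the lower tail
p_lower = 2 * pnorm(z)

# Equivalent form that works for either sign of z
p_upper = 2 * pnorm(abs(z), lower.tail = FALSE)

c(p_lower, p_upper)
```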
One-tailed example:
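As one possible one-tailed illustration, the sketch below repeats the "Nov" versus "Dec" comparison with the alternative "less" (an assumption made here for the example), so that \(c = 0.5\) and the p-value is the lower tail \(P(Z<z)\). The data are rebuilt from nottem so the block runs on its own, and the names tie_counts, cc, z_less, and p_less are illustrative.

```r
# Rebuild the Nov and Dec columns
m = matrix(nottem, ncol = 12, byrow = TRUE)
data_x = m[, 11]
data_y = m[, 12]

diffs = data_x - data_y - 5
diffs = diffs[diffs != 0]
n = length(diffs)
r = rank(abs(diffs))
V = sum(r[diffs > 0])
tie_counts = table(r)

cc = 0.5  # continuity correction for alternative = "less"
z_less = ((V + cc) - n * (n + 1) / 4) /
  sqrt(n * (n + 1) * (2 * n + 1) / 24 - sum(tie_counts^3 - tie_counts) / 48)

# Left-tailed p-value: P(Z < z)
p_less = pnorm(z_less)
p_less

# Cross-check against wilcox.test(); the ties warning is expected
p_less_r = suppressWarnings(
  wilcox.test(data_x, data_y, paired = TRUE,
              alternative = "less", mu = 5)$p.value)
p_less_r
```

For a right-tailed test, the same steps would apply with \(c = -0.5\) and pnorm(z, lower.tail = FALSE).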
Copyright © 2020 - 2024. All Rights Reserved by Stats Codes