Here, we discuss the paired samples Wilcoxon signed-rank test in R with interpretations, including test statistics, p-values, and confidence intervals.

The paired samples Wilcoxon signed-rank test in R can be performed with the wilcox.test() function from the base "stats" package.

The paired samples Wilcoxon signed-rank test assumes that the paired differences have a symmetric distribution. It tests whether the median of the differences between paired values, from the two populations that the two dependent samples come from, is equal to a certain value stated in the null hypothesis. It is a non-parametric alternative to the paired samples t-test.

In the paired samples Wilcoxon signed-rank test, the test statistic is based on the signs of ranks. It is the sum of the ranks of the observed paired differences that are greater than the null hypothesis median, where the ranks are based on the distances of the observed paired differences to the null hypothesis median.

Paired Samples Wilcoxon Signed-Rank Tests & Hypotheses
With the assumption that the paired differences have a symmetric distribution.

| Question | Is the median of paired x and y differences equal to \(m_0\)? | Is the median of paired x and y differences greater than \(m_0\)? | Is the median of paired x and y differences less than \(m_0\)? |
|---|---|---|---|
| Form of Test | Two-tailed test | Right-tailed test | Left-tailed test |
| Null Hypothesis, \(H_0\) | \(m_d = m_0\) | \(m_d = m_0\) | \(m_d = m_0\) |
| Alternate Hypothesis, \(H_1\) | \(m_d \neq m_0\) | \(m_d > m_0\) | \(m_d < m_0\) |

Sample Steps to Run a Paired Samples Wilcoxon Signed-Rank Test:

# Create the data samples for the paired samples Wilcoxon signed-rank test
# Values are paired based on matching position in each sample

data_x = c(4.0, 4.6, 3.9, 5.1, 2.5)
data_y = c(2.4, 3.5, 2.5, 4.7, 6.1)

# Run the paired samples Wilcoxon signed-rank test with specifications

wilcox.test(data_x - data_y, alternative = "two.sided",
            mu = 0,
            conf.int = TRUE, conf.level = 0.95)

# or

wilcox.test(data_x, data_y, alternative = "two.sided",
            mu = 0, paired = TRUE,
            conf.int = TRUE, conf.level = 0.95)

    Wilcoxon signed rank exact test

data:  data_x and data_y
V = 10, p-value = 0.625
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -3.6  1.6
sample estimates:
(pseudo)median 
           0.9 

Table of Some Paired Samples Wilcoxon Signed-Rank Test Arguments in R

| Argument | Usage |
|---|---|
| x - y | Paired differences, where x is the first sample data values and y is the second sample data values |
| x, y | Used when the argument "paired" is set to TRUE |
| paired | Set to TRUE for paired two sample test |
| mu | Population median of paired differences in null hypothesis |
| alternative | Set alternate hypothesis as "greater", "less", or the default "two.sided" |
| exact | For N<50 and no zeroes and no rank ties: set to FALSE to compute p-value based on normal distribution (default = NULL, which gives an exact p-value in this case) |
| correct | For cases with non-exact p-values: set to FALSE to remove continuity correction (default = TRUE) |
| conf.int | Set to TRUE to include the confidence interval (default = FALSE) |
| conf.level | Level of confidence for the confidence interval (default = 0.95) |
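
As a rough illustration of the "exact" and "correct" arguments, the sketch below reruns the five-pair example above three ways. The object names p_exact, p_corrected, and p_uncorrected are introduced here for illustration only.

```r
# Same five pairs as in the sample steps above
data_x = c(4.0, 4.6, 3.9, 5.1, 2.5)
data_y = c(2.4, 3.5, 2.5, 4.7, 6.1)

# Exact p-value (used by default here: fewer than 50 pairs, no ties, no zeroes)
p_exact = wilcox.test(data_x, data_y, paired = TRUE)$p.value

# Normal approximation with continuity correction
p_corrected = wilcox.test(data_x, data_y, paired = TRUE,
                          exact = FALSE)$p.value

# Normal approximation without continuity correction
p_uncorrected = wilcox.test(data_x, data_y, paired = TRUE,
                            exact = FALSE, correct = FALSE)$p.value

# Compare the three p-values side by side
c(exact = p_exact, corrected = p_corrected, uncorrected = p_uncorrected)
```

With so few pairs the three p-values differ noticeably, which is why the exact version is preferable whenever it is available.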

Creating a Paired Samples Wilcoxon Signed-Rank Test Object:

# Create data
data_x = rnorm(40); data_y = rnorm(40)

# Create object
wsrt_object = wilcox.test(data_x, data_y, alternative = "two.sided",
                          mu = 0, paired = TRUE,
                          conf.int = TRUE, conf.level = 0.95)

# Extract a component
wsrt_object$statistic
  V 
430 

Table of Some Paired Samples Wilcoxon Signed-Rank Test Object Outputs in R

| Test Component | Usage |
|---|---|
| wsrt_object$statistic | Test-statistic value |
| wsrt_object$p.value | P-value |
| wsrt_object$estimate | Point estimate, the (pseudo)median of paired differences, when conf.int = TRUE |
| wsrt_object$conf.int | Confidence interval when conf.int = TRUE |
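
Beyond the components in the table, the returned object is a standard "htest" list, so a few other fields can be inspected the same way. The short sketch below uses the five-pair data from the sample steps and is only meant as an illustration.

```r
data_x = c(4.0, 4.6, 3.9, 5.1, 2.5)
data_y = c(2.4, 3.5, 2.5, 4.7, 6.1)
wsrt_object = wilcox.test(data_x, data_y, paired = TRUE, conf.int = TRUE)

names(wsrt_object)        # all available components
wsrt_object$null.value    # value of mu in the null hypothesis
wsrt_object$alternative   # "two.sided", "greater", or "less"
wsrt_object$method        # which version of the test was run
attr(wsrt_object$conf.int, "conf.level")  # level attached to the interval
```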

1 Test Statistic for Paired Samples Wilcoxon Signed-Rank Test in R

Tied values receive the average of the ranks they span, for example \(\text{rank}(2, 4, 4, 6, 8, 8, 8) = (1, 2.5, 2.5, 4, 6, 6, 6)\).
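
The same ranking can be checked with the base rank() function, whose default tie method is "average". The variable name ranks is only for illustration.

```r
# Average ranks for ties (the default ties.method = "average")
ranks = rank(c(2, 4, 4, 6, 8, 8, 8))
ranks
# [1] 1.0 2.5 2.5 4.0 6.0 6.0 6.0
```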

Let \(x_i\) and \(y_i\) be the paired sample values,

\(m_0\) is the population median of the paired differences to be tested and set in the null hypothesis,

\(R_i\) is the rank of \(|(x_i - y_i) - m_0|\), among all such absolute differences (distances),

\(I_{([x_i-y_i]-m_0)>0}\) is \(1\) when \(([x_i-y_i]-m_0)>0\) and \(0\) otherwise, and

\(N\) is the number of sample pairs.

The paired samples Wilcoxon signed-rank test has a test statistic, \(V\), of the form:

\[V = \sum_{i=1}^N R_i \cdot I_{([x_i-y_i]-m_0)>0}.\]

See also the Wilcoxon signed rank test for one sample and the sign test for paired samples.

For unpaired (or independent) samples, see the Wilcoxon rank sum test.

Large Samples:

For large samples (\(N\geq50\)), or cases with rank ties or at least one \(([x_i-y_i] - m_0) = 0\):

With \(n\) as the number of non-zero \(([x_i-y_i] - m_0)\), \(T\) as the number of distinct rank values, and \(t_k\) as the number of observations that share the \(k\)-th distinct rank, inference on \(V\) and the test outcome is based on the normal distribution approximation, obtained by standardizing \(V\).

With \(\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}=0\) if there are no ties (all \(t_k = 1\)),

\[z = \frac{V - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]

Applying continuity correction for the normal distribution approximation (the default in R),

\[z = \frac{(V + c) - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]

For a two-sided test, \(c=0.5\) if \(V<\frac{n(n+1)}{4}\), \(c=-0.5\) if \(V>\frac{n(n+1)}{4}\), and \(c=0\) if \(V=\frac{n(n+1)}{4}\). For a one-sided test, \(c=-0.5\) if the alternative is "greater", and \(c=0.5\) if it is "less".
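
The standardization and the choice of \(c\) can be sketched as a small function. The name signed_rank_z and the data x_demo and y_demo are hypothetical and not part of base R. The sketch assumes zero differences are dropped before ranking, and its result is checked against wilcox.test() rather than taken on trust.

```r
# Hypothetical helper: V and its standardized z-value, following the formulas above
signed_rank_z = function(x, y, mu = 0, alternative = "two.sided", correct = TRUE) {
  diffs = (x - y) - mu
  diffs = diffs[diffs != 0]          # zero differences are dropped
  n = length(diffs)                  # number of non-zero differences
  r = rank(abs(diffs))               # tied distances get average ranks
  V = sum(r[diffs > 0])
  tie_counts = table(r)              # t_k for each distinct rank
  center = n * (n + 1) / 4
  sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24 -
               sum(tie_counts^3 - tie_counts) / 48)
  cc = 0                             # continuity correction c
  if (correct) {
    cc = switch(alternative,
                two.sided = -0.5 * sign(V - center),
                greater = -0.5,
                less = 0.5)
  }
  c(V = V, z = (V + cc - center) / sigma)
}

# Made-up paired data containing ties
x_demo = c(5, 7, 6, 9, 4, 8, 10)
y_demo = c(3, 6, 8, 5, 5, 4, 6)

res = signed_rank_z(x_demo, y_demo)
ref = wilcox.test(x_demo, y_demo, paired = TRUE, exact = FALSE)

# The two-sided p-value from z should agree with wilcox.test()
all.equal(2 * pnorm(-abs(res[["z"]])), ref$p.value)
```

Passing exact = FALSE forces the normal approximation in wilcox.test(), so the comparison is like for like.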

Small Samples with No Rank Ties or Median Match:

For small sample pairs (\(N<50\)) with no rank ties or any \(([x_i-y_i] - m_0)=0\):

The p-value is based on the exact distribution of the Wilcoxon signed rank statistic \(V\), with \(\text{size} = N\).
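
As a minimal sketch, the exact two-sided p-value can be reproduced with psignrank(), the distribution function of the signed rank statistic in R. The values V = 10 and N = 5 are those of the five-pair example in the sample steps. Doubling the smaller tail is the convention assumed here.

```r
V = 10; N = 5

# Tail probabilities under the exact null distribution of V
upper_tail = psignrank(V - 1, N, lower.tail = FALSE)  # P(V >= 10)
lower_tail = psignrank(V, N)                          # P(V <= 10)

# Two-sided p-value: double the smaller tail, capped at 1
p_exact = min(2 * min(upper_tail, lower_tail), 1)
p_exact
# [1] 0.625
```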

2 Simple Paired Samples Wilcoxon Signed-Rank Test in R

Enter the data by hand.

data_x = c(0.11, -0.64, -0.85, -1.02, 0.12, -0.95,
           -0.49, -0.26, 1.84, -0.65, 0.24, 0.08,
           -0.96, 0.57, 1.44, 0.45, 0.04, -0.42)
data_y = c(0.44, -1.65, -0.97, -1.30, 0.68, -1.32,
           0.49, -0.63, 2.89, -1.70, -1.02, 3.32,
           -1.38, -0.93, -2.08, -0.03, 0.56, 0.15)

We test the following null hypothesis \(H_0\) against the alternative hypothesis \(H_1\), at the level of significance \(\alpha=0.05\).

\(H_0:\) the population median of the paired differences is equal to 0 (\(m_d = 0\)).

\(H_1:\) the population median of the paired differences is not equal to 0 (\(m_d \neq 0\), hence the default two-sided).

Because the level of significance is \(\alpha=0.05\), the level of confidence is \(1 - \alpha = 0.95\).

The wilcox.test() function uses "two.sided" as the default alternative, 0 as the default median of paired differences, and 0.95 as the default level of confidence. Hence, you do not need to specify the "alternative", "mu", and "conf.level" arguments in this case.

wilcox.test(data_x, data_y, paired = TRUE,
            alternative = "two.sided",
            mu = 0, 
            conf.int = TRUE, conf.level = 0.95)

Or:

wilcox.test(data_x, data_y, paired = TRUE,
            conf.int = TRUE)

    Wilcoxon signed rank exact test

data:  data_x and data_y
V = 99, p-value = 0.5798
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -0.350  0.735
sample estimates:
(pseudo)median 
         0.225 

The estimate of the median of paired differences, \(\tilde d\), is 0.225,

test statistic, \(V\), is 99,

the p-value, \(p\), is 0.5798,

the 95% confidence interval is [-0.350, 0.735].

Interpretation:

Note that for wilcox.test() in R, the p-value and confidence interval methods may disagree in some edge cases, because the p-value is based on either the exact distribution or the normal approximation, while the confidence interval is sometimes based on approximations.

  • P-value: With the p-value (\(p = 0.5798\)) being greater than the level of significance 0.05, we fail to reject the null hypothesis that the population median of the paired differences is equal to 0.

  • Confidence Interval: With the null hypothesis median of paired differences (\(m_d = 0\)) being inside the confidence interval, \([-0.350, 0.735]\), we fail to reject the null hypothesis that the population median of the paired differences is equal to 0.
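
The two decision rules above can also be checked in code. This is only a sketch: alpha, mu0, reject_by_p, and reject_by_ci are names introduced here, and the data are the same as entered by hand in this section.

```r
data_x = c(0.11, -0.64, -0.85, -1.02, 0.12, -0.95,
           -0.49, -0.26, 1.84, -0.65, 0.24, 0.08,
           -0.96, 0.57, 1.44, 0.45, 0.04, -0.42)
data_y = c(0.44, -1.65, -0.97, -1.30, 0.68, -1.32,
           0.49, -0.63, 2.89, -1.70, -1.02, 3.32,
           -1.38, -0.93, -2.08, -0.03, 0.56, 0.15)

alpha = 0.05  # level of significance
mu0 = 0       # null hypothesis median of paired differences

wsrt_object = wilcox.test(data_x, data_y, paired = TRUE, mu = mu0,
                          conf.int = TRUE, conf.level = 1 - alpha)

# Rule 1: reject when the p-value is below alpha
reject_by_p = wsrt_object$p.value < alpha

# Rule 2: reject when mu0 falls outside the confidence interval
ci = wsrt_object$conf.int
reject_by_ci = (mu0 < ci[1]) || (mu0 > ci[2])

c(reject_by_p = reject_by_p, reject_by_ci = reject_by_ci)
```

Both entries are FALSE for this data, matching the two interpretations above.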

3 Two-tailed Paired Samples Wilcoxon Signed-Rank Test in R

Using a subset of the nottem data from the "datasets" package, with 10 of the 20 rows shown below:

nottem_data = data.frame(matrix(nottem, ncol = 12, byrow = TRUE))[,1:6]
colnames(nottem_data) = c("Jan", "Feb", "Mar", "Apr", "May", "June")
rownames(nottem_data) = 1920:1939
nottem_data
      Jan  Feb  Mar  Apr  May June
1920 40.6 40.8 44.4 46.7 54.1 58.5
1924 39.3 37.5 38.3 45.5 53.2 57.7
1925 40.0 40.5 40.8 45.1 53.8 59.4
1926 39.2 43.4 43.4 48.9 50.6 56.8
1929 34.8 31.3 41.0 43.9 53.1 56.9
1932 42.4 38.4 40.3 44.6 50.9 57.0
1935 40.0 42.6 43.5 47.1 50.0 60.5
1936 37.3 35.0 44.0 43.9 52.7 58.6
1937 40.8 41.0 38.4 47.4 54.1 58.6
1939 39.4 40.9 42.4 47.8 52.4 58.0

We use “Apr” as the x group and “Jan” as the y group.

We test the following null hypothesis \(H_0\) against the alternative hypothesis \(H_1\), at the level of significance \(\alpha=0.1\), without continuity correction.

\(H_0:\) the population median of the paired differences is equal to 5 (\(m_d = 5\)).

\(H_1:\) the population median of the paired differences is not equal to 5 (\(m_d \neq 5\), hence the default two-sided).

Because the level of significance is \(\alpha=0.1\), the level of confidence is \(1 - \alpha = 0.9\).

wilcox.test(nottem_data$Apr, nottem_data$Jan,
            paired = TRUE,
            alternative = "two.sided", mu = 5,
            correct = FALSE,
            conf.int = TRUE, conf.level = 0.9)
Warning in wilcox.test.default(nottem_data$Apr, nottem_data$Jan, paired = TRUE,
: cannot compute exact p-value with ties
Warning in wilcox.test.default(nottem_data$Apr, nottem_data$Jan, paired = TRUE,
: cannot compute exact confidence interval with ties

    Wilcoxon signed rank test

data:  nottem_data$Apr and nottem_data$Jan
V = 171, p-value = 0.01373
alternative hypothesis: true location shift is not equal to 5
90 percent confidence interval:
 5.599914 7.499985
sample estimates:
(pseudo)median 
      6.537784 

The warnings appear because there are ties in the data. Hence, the p-value is based on the normal approximation rather than the exact distribution.

Interpretation:

  • P-value: With the p-value (\(p = 0.01373\)) being less than the level of significance 0.1, we reject the null hypothesis that the population median of the paired differences is equal to 5.

  • Confidence Interval: With the null hypothesis median of paired differences (\(m_d = 5\)) being outside the confidence interval, \([5.599914, 7.499985]\), we reject the null hypothesis that the population median of the paired differences is equal to 5.

4 One-tailed Paired Samples Wilcoxon Signed-Rank Test in R

Right Tailed Test

Using a subset of the nottem data from the "datasets" package, with 10 of the 20 rows shown below:

nottem_data = data.frame(matrix(nottem, ncol = 12, byrow = TRUE))[,7:12]
colnames(nottem_data) = c("Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
rownames(nottem_data) = 1920:1939
nottem_data
      Jul  Aug  Sep  Oct  Nov  Dec
1920 57.7 56.4 54.3 50.5 42.9 39.8
1924 60.8 58.2 56.4 49.8 44.4 43.6
1925 63.5 61.0 53.0 50.0 38.1 36.3
1926 62.5 62.0 57.5 46.7 41.6 39.8
1929 62.5 60.3 59.8 49.2 42.9 41.9
1932 62.1 63.5 56.3 47.3 43.6 41.8
1935 64.6 64.0 56.8 48.6 44.2 36.4
1936 60.0 61.1 58.1 49.6 41.6 41.3
1937 61.4 61.8 56.3 50.9 41.4 37.1
1939 60.7 61.8 58.2 46.7 46.6 37.8

We use “Jul” as the x group and “Oct” as the y group.

We test the following null hypothesis \(H_0\) against the alternative hypothesis \(H_1\), at the level of significance \(\alpha=0.1\).

\(H_0:\) the population median of the paired differences is equal to 10 (\(m_d = 10\)).

\(H_1:\) the population median of the paired differences is greater than 10 (\(m_d > 10\), hence one-sided).

Because the level of significance is \(\alpha=0.1\), the level of confidence is \(1 - \alpha = 0.9\).

wilcox.test(nottem_data$Jul, nottem_data$Oct,
            paired = TRUE,
            alternative = "greater", mu = 10,
            conf.int = TRUE, conf.level = 0.9)
Warning in wilcox.test.default(nottem_data$Jul, nottem_data$Oct, paired = TRUE,
: cannot compute exact p-value with ties
Warning in wilcox.test.default(nottem_data$Jul, nottem_data$Oct, paired = TRUE,
: cannot compute exact confidence interval with ties

    Wilcoxon signed rank test with continuity correction

data:  nottem_data$Jul and nottem_data$Oct
V = 186, p-value = 0.001324
alternative hypothesis: true location shift is greater than 10
90 percent confidence interval:
 11.60003      Inf
sample estimates:
(pseudo)median 
          12.5 

Interpretation:

  • P-value: With the p-value (\(p = 0.001324\)) being less than the level of significance 0.1, we reject the null hypothesis that the population median of the paired differences is equal to 10.

  • Confidence Interval: With the null hypothesis median of paired differences (\(m_d = 10\)) being outside the confidence interval, \([11.60003, \infty)\), we reject the null hypothesis that the population median of the paired differences is equal to 10.

Left Tailed Test

We use “Aug” as the x group and “Sep” as the y group.

We test the following null hypothesis \(H_0\) against the alternative hypothesis \(H_1\), at the level of significance \(\alpha=0.05\).

\(H_0:\) the population median of the paired differences is equal to 4.5 (\(m_d = 4.5\)).

\(H_1:\) the population median of the paired differences is less than 4.5 (\(m_d < 4.5\), hence one-sided).

Because the level of significance is \(\alpha=0.05\), the level of confidence is \(1 - \alpha = 0.95\).

wilcox.test(nottem_data$Aug, nottem_data$Sep,
            paired = TRUE,
            alternative = "less", mu = 4.5,
            conf.int = TRUE, conf.level = 0.95)
Warning in wilcox.test.default(nottem_data$Aug, nottem_data$Sep, paired = TRUE,
: cannot compute exact p-value with ties
Warning in wilcox.test.default(nottem_data$Aug, nottem_data$Sep, paired = TRUE,
: cannot compute exact confidence interval with ties
Warning in wilcox.test.default(nottem_data$Aug, nottem_data$Sep, paired = TRUE,
: cannot compute exact p-value with zeroes
Warning in wilcox.test.default(nottem_data$Aug, nottem_data$Sep, paired = TRUE,
: cannot compute exact confidence interval with zeroes

    Wilcoxon signed rank test with continuity correction

data:  nottem_data$Aug and nottem_data$Sep
V = 76, p-value = 0.2283
alternative hypothesis: true location shift is less than 4.5
95 percent confidence interval:
     -Inf 5.049966
sample estimates:
(pseudo)median 
       4.04234 

The warnings appear because there are ties in the data, and \([x_i-y_i] - m_0 = 0\) for at least one observation. Hence, the p-value is based on the normal approximation rather than the exact distribution.
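
Whether these warnings will appear can be checked before running the test. The helper check_exact_conditions and the data frame nottem_all below are hypothetical names used for this sketch. It assumes, as in the formulas earlier, that zeroes are removed before ties are looked for.

```r
# Rebuild all twelve monthly columns from the built-in nottem series
nottem_all = data.frame(matrix(nottem, ncol = 12, byrow = TRUE))
colnames(nottem_all) = month.abb

# Hypothetical helper: the two conditions that rule out an exact p-value
check_exact_conditions = function(x, y, mu = 0) {
  diffs = (x - y) - mu
  nonzero = diffs[diffs != 0]
  c(has_zeroes = any(diffs == 0),
    has_ties = anyDuplicated(abs(nonzero)) > 0)
}

# Left-tailed example above: both zeroes and ties
check_exact_conditions(nottem_all$Aug, nottem_all$Sep, mu = 4.5)

# Two-tailed example (Apr versus Jan, mu = 5): ties only
check_exact_conditions(nottem_all$Apr, nottem_all$Jan, mu = 5)
```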

Interpretation:

  • P-value: With the p-value (\(p = 0.2283\)) being greater than the level of significance 0.05, we fail to reject the null hypothesis that the population median of the paired differences is equal to 4.5.

  • Confidence Interval: With the null hypothesis median of paired differences (\(m_d = 4.5\)) being inside the confidence interval, \((-\infty, 5.049966]\), we fail to reject the null hypothesis that the population median of the paired differences is equal to 4.5.

5 Paired Samples Wilcoxon Signed-Rank Test: Test Statistics & P-values in R

Here, for a paired samples Wilcoxon signed-rank test, we show how to get the test statistic (and z-value) and the p-value from the wilcox.test() function in R, or by writing the code by hand.

data_x = nottem_data$Nov; data_y = nottem_data$Dec
wsrt_object = wilcox.test(data_x, data_y, paired = TRUE,
                          correct = TRUE,
                          alternative = "two.sided", mu = 5)
Warning in wilcox.test.default(data_x, data_y, paired = TRUE, correct = TRUE, :
cannot compute exact p-value with ties
wsrt_object

    Wilcoxon signed rank test with continuity correction

data:  data_x and data_y
V = 47.5, p-value = 0.03326
alternative hypothesis: true location shift is not equal to 5

To get the test statistic and z-value:

\[V = \sum_{i=1}^N R_i \cdot I_{([x_i-y_i]-m_0)>0}.\]

With continuity correction:

\[z = \frac{(V + c) - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]

For a two-sided test, \(c=0.5\) if \(V<\frac{n(n+1)}{4}\), \(c=-0.5\) if \(V>\frac{n(n+1)}{4}\), and \(c=0\) if \(V=\frac{n(n+1)}{4}\). For a one-sided test, \(c=-0.5\) if the alternative is "greater", and \(c=0.5\) if it is "less".

wsrt_object$statistic
   V 
47.5 
# to remove name V
unname(wsrt_object$statistic)
[1] 47.5

Same as:

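mu = 5; diffs = (data_x - data_y) - mu
diffs = diffs[diffs != 0] # zero differences are dropped before ranking
r = rank(abs(diffs)) # ranks of the distances to mu, ties averaged
V = sum(r[diffs>0]) # sum of ranks for differences above mu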
V
[1] 47.5

For z-value:

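cc = 0.5 # continuity correction: two-sided and V < n*(n + 1)/4 (47.5<105)
tk = table(r) # number of observations at each distinct rank
n = length(diffs[diffs!=0]) # number of non-zero differences
num = (V + cc) - n*(n + 1)/4
denom = sqrt(n*(n+1)*(2*n+1)/24 - sum(tk^3 - tk)/48)
z = num/denom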
z
[1] -2.12889

To get the p-value for normal approximation:

Two-tailed: for a positive z-value (\(z^+\)), \(\text{p-value} = 2 \cdot P(Z>z^+)\), and for a negative z-value (\(z^-\)), \(\text{p-value} = 2 \cdot P(Z<z^-)\).

One-tailed: for the right tail, \(\text{p-value} = P(Z>z)\), and for the left tail, \(\text{p-value} = P(Z<z)\).

wsrt_object$p.value
[1] 0.0332634

Same as:

Note that the p-value depends on the standardized test statistic (\(z = -2.12889\)). We use the distribution function pnorm() for the normal distribution in R.

2*pnorm(-2.12889); 2*(1-pnorm(2.12889))
[1] 0.03326336
[1] 0.03326336

One-tailed example:

# Right tailed, for an illustrative z-value of 2.12889
1-pnorm(2.12889)
# Left tailed, for an illustrative z-value of -2.12889
pnorm(-2.12889)
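
The three cases can be gathered into one small function. The name p_from_z is hypothetical and introduced only for this sketch. Note that the z-value passed in must already carry the continuity correction for the chosen alternative, since \(c\) differs between the two-sided and one-sided cases.

```r
# Hypothetical helper: p-value from a z-value under the normal approximation
p_from_z = function(z, alternative = "two.sided") {
  switch(alternative,
         two.sided = 2 * pnorm(-abs(z)),          # both tails
         greater = pnorm(z, lower.tail = FALSE),  # right tail, P(Z > z)
         less = pnorm(z))                         # left tail, P(Z < z)
}

p_from_z(-2.12889)          # two-sided, same z-value as above
p_from_z(-2.12889, "less")  # left tail for the same z-value
```

The two-sided call gives the same value as the 2*pnorm(-2.12889) line above, and the left-tail call gives half of it.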

Copyright © 2020 - 2024. All Rights Reserved by Stats Codes