Here, we discuss the one sample Wilcoxon signed-rank test in R with interpretations, including test statistics, p-values, and confidence intervals.

The one sample Wilcoxon signed-rank test in R can be performed with the wilcox.test() function from the base "stats" package.

The one sample Wilcoxon signed-rank test, which assumes a symmetric population distribution, can be used to test whether the median of the population from which an independent random sample is drawn is equal to a certain value (which is stated in the null hypothesis) or not. It is a non-parametric alternative to the one sample t-test.

In the one sample Wilcoxon signed-rank test, the test statistic is based on signed ranks. It is the sum of the ranks of the observed values that are greater than the null hypothesis median, where the ranks are based on the distances of each of the observed values to the null hypothesis median.

One Sample Wilcoxon Signed-Rank Tests & Hypotheses
With the assumption that the distribution is symmetric.

Question | Is the median equal to \(m_0\)? | Is the median greater than \(m_0\)? | Is the median less than \(m_0\)?
Form of Test | Two-tailed test | Right-tailed test | Left-tailed test
Null Hypothesis, \(H_0\) | \(m = m_0\) | \(m = m_0\) | \(m = m_0\)
Alternate Hypothesis, \(H_1\) | \(m \neq m_0\) | \(m > m_0\) | \(m < m_0\)

Sample Steps to Run a One Sample Wilcoxon Signed-Rank Test:

# Create the data for the one sample Wilcoxon signed-rank test

data = c(0.2, 0.1, -1.0, -0.4, 1.4, 0.5)

# Run the one sample Wilcoxon signed-rank test with specifications

wilcox.test(data, alternative = "two.sided",
            mu = 0,
            conf.int = TRUE, conf.level = 0.95)

    Wilcoxon signed rank exact test

data:  data
V = 13, p-value = 0.6875
alternative hypothesis: true location is not equal to 0
95 percent confidence interval:
 -1.0  1.4
sample estimates:
(pseudo)median 
          0.15 

Table of Some One Sample Wilcoxon Signed-Rank Test Arguments in R

Argument | Usage
x | Sample data values
mu | Population median value in null hypothesis (default = 0)
alternative | Set alternate hypothesis as "greater", "less", or the default "two.sided"
exact | For n<50 and no zeroes and no rank ties: set to FALSE to compute the p-value based on the normal distribution (default = NULL, which gives the exact p-value in this case)
correct | For cases with non-exact p-values: set to FALSE to remove the continuity correction (default = TRUE)
conf.int | Set to TRUE to include the confidence interval (default = FALSE)
conf.level | Level of confidence for the test and confidence interval (default = 0.95)

Creating a One Sample Wilcoxon Signed-Rank Test Object:

# Create data (random draws, so the results vary between runs)
data = rnorm(30)

# Create object
wsrt_object = wilcox.test(data, alternative = "two.sided",
                          mu = 0,
                          conf.int = TRUE, conf.level = 0.95)

# Extract a component
wsrt_object$statistic
  V 
213 

Table of Some One Sample Wilcoxon Signed-Rank Test Object Outputs in R

Test Component | Usage
wsrt_object$statistic | Test-statistic value
wsrt_object$p.value | P-value
wsrt_object$estimate | Point estimate of the (pseudo)median when conf.int = TRUE
wsrt_object$conf.int | Confidence interval when conf.int = TRUE
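
As a reproducible sketch, the components listed above can also be extracted for a fixed sample. The example below reuses the six values from the sample steps earlier on this page; the names small_data and wsrt_small are chosen here purely for illustration.

```r
# Fixed data, so the results are the same on every run
small_data = c(0.2, 0.1, -1.0, -0.4, 1.4, 0.5)

wsrt_small = wilcox.test(small_data, mu = 0,
                         conf.int = TRUE, conf.level = 0.95)

wsrt_small$statistic   # test statistic V (13 in the earlier output)
wsrt_small$p.value     # p-value (0.6875 in the earlier output)
wsrt_small$conf.int    # confidence interval, with its level as an attribute
wsrt_small$estimate    # (pseudo)median estimate
```

The values in the comments are the ones printed in the sample output above for the same data.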

1 Test Statistic for One Sample Wilcoxon Signed-Rank Test in R

Tied values receive the average of the ranks they occupy, for example, \(\text{rank}(1, 2, 2, 3, 4, 4, 4) = (1, 2.5, 2.5, 4, 6, 6, 6)\).
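
This ranking convention can be checked directly in R, since rank() averages tied ranks by default. A minimal sketch (the name example_ranks is only for illustration):

```r
# rank() uses ties.method = "average" by default,
# so tied values share the mean of the ranks they occupy
example_ranks = rank(c(1, 2, 2, 3, 4, 4, 4))
example_ranks
# [1] 1.0 2.5 2.5 4.0 6.0 6.0 6.0
```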

Let the \(x_i\)'s be the sample values,

\(m_0\) is the population median value to be tested and set in the null hypothesis,

\(R_i\) is the rank of \(|x_i - m_0|\), among all absolute differences (distances),

\(I_{(x_i-m_0)>0}\) is \(1\) when \((x_i-m_0)>0\) and \(0\) otherwise, and

\(N\) is the sample size.

The one sample Wilcoxon signed-rank test has a test statistic, \(V\), of the form:

\[V = \sum_{i=1}^N R_i \cdot I_{(x_i-m_0)>0}.\]
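
As a small worked illustration of this formula, the sketch below applies it to the six values from the sample steps earlier on this page, with \(m_0 = 0\). The variable names (x, m0, d, R, V) are chosen here to mirror the notation and are not part of any R API.

```r
x = c(0.2, 0.1, -1.0, -0.4, 1.4, 0.5)
m0 = 0

d = x - m0           # differences from the null hypothesis median
R = rank(abs(d))     # ranks of the distances |x_i - m0|: 2 1 5 3 6 4
V = sum(R[d > 0])    # add the ranks of the positive differences: 2 + 1 + 6 + 4
V
# [1] 13
```

This agrees with the V = 13 reported by wilcox.test() for the same data.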

See also the sign test for one sample and the Wilcoxon signed rank test for paired samples.

Large Samples:

For large samples (\(N\geq50\)), or cases with rank ties or at least one \((x_i - m_0) = 0\):

With \(n\) as the number of non-zero \((x_i - m_0)\), \(T\) as the number of distinct rank values, and \(t_k\) as the number of observations that share the \(k\)-th rank value (\(t_k = 1\) for an untied rank), inference on \(V\) is based on the normal distribution approximation.

The tie correction term \(\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}\) is \(0\) if there are no ties (all \(t_k =1\)).

Without continuity correction,

\[z = \frac{V - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]

With continuity correction (the default in R) for the normal distribution approximation,

\[z = \frac{(V + c) - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]

For a two-sided test, \(c=-0.5\) when \(V>\frac{n(n+1)}{4}\), \(c=0.5\) when \(V<\frac{n(n+1)}{4}\), and \(c=0\) when \(V=\frac{n(n+1)}{4}\). For a one-sided test, \(c=-0.5\) when the alternative is "greater", and \(c=0.5\) when it is "less".
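
The normal approximation above can be sketched as a small function. This is an illustrative helper written for this page (the name wsr_normal_approx and its arguments are hypothetical, not part of base R), and it is compared with wilcox.test() using exact = FALSE so that both use the normal approximation.

```r
# Hypothetical helper: normal approximation for the signed-rank statistic V
wsr_normal_approx = function(x, m0 = 0,
                             alternative = "two.sided",
                             correct = TRUE) {
  d = x - m0
  d = d[d != 0]                  # n counts only the non-zero differences
  n = length(d)
  R = rank(abs(d))               # average ranks of the distances
  V = sum(R[d > 0])
  tk = table(R)                  # t_k: number of observations at each rank value
  center = V - n * (n + 1) / 4
  cc = 0                         # continuity correction c from the formula
  if (correct) {
    cc = switch(alternative,
                two.sided = -0.5 * sign(center),
                greater = -0.5,
                less = 0.5)
  }
  sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24 - sum(tk^3 - tk) / 48)
  z = (center + cc) / sigma
  p = switch(alternative,
             two.sided = 2 * pnorm(-abs(z)),
             greater = pnorm(z, lower.tail = FALSE),
             less = pnorm(z))
  list(V = V, z = z, p.value = p)
}

approx_res = wsr_normal_approx(trees$Girth, m0 = 12.5,
                               alternative = "greater")
approx_res$p.value

# Built-in result with the normal approximation forced
builtin_p = wilcox.test(trees$Girth, mu = 12.5, alternative = "greater",
                        exact = FALSE)$p.value
builtin_p
```

Under these assumptions the two p-values should agree up to floating-point error.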

Small Samples with No Rank Ties or Median Match:

For small sample sizes (\(N<50\)) with no rank ties and no \((x_i - m_0)=0\):

The p-value is based on the exact distribution of the Wilcoxon signed rank statistic \(V\), with \(\text{size} = N\).
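
As an illustration, the exact distribution is available in R through psignrank(). The sketch below computes a two-sided exact p-value for \(V = 13\) and \(N = 6\) (the six-value sample from the sample steps earlier on this page) by doubling the smaller tail probability; the variable names are chosen here for illustration.

```r
V = 13   # observed statistic for c(0.2, 0.1, -1.0, -0.4, 1.4, 0.5) with m0 = 0
N = 6    # sample size (no zero differences, no ties)

p_upper = psignrank(V - 1, N, lower.tail = FALSE)   # P(V >= 13)
p_lower = psignrank(V, N)                           # P(V <= 13)

# Two-sided p-value: double the smaller tail, capped at 1
p_exact = min(1, 2 * min(p_upper, p_lower))
p_exact
# [1] 0.6875
```

This matches the p-value reported by wilcox.test() for the same data.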

2 Simple One Sample Wilcoxon Signed-Rank Test in R

Enter the data by hand.

data = c(0.78, -0.08, 0.25, -0.03, -0.04, 1.37,
         -0.23, 1.52, -1.55, 0.58, 0.12, 0.22,
         0.38, -0.50, -0.33, -1.02, -1.07, 0.30)

For the following null hypothesis \(H_0\), and alternative hypothesis \(H_1\), with the level of significance \(\alpha=0.05\).

\(H_0:\) the population median is equal to 0 (\(m = 0\)).

\(H_1:\) the population median is not equal to 0 (\(m \neq 0\), hence the default two-sided).

Because the level of significance is \(\alpha=0.05\), the level of confidence is \(1 - \alpha = 0.95\).

The wilcox.test() function has the default alternative as "two.sided", the default median as 0, and the default level of confidence as 0.95, hence, you do not need to specify the "alternative", "mu", and "conf.level" arguments in this case.

wilcox.test(data, alternative = "two.sided",
            mu = 0,
            conf.int = TRUE, conf.level = 0.95)

Or:

wilcox.test(data, conf.int = TRUE)

    Wilcoxon signed rank exact test

data:  data
V = 92, p-value = 0.7987
alternative hypothesis: true location is not equal to 0
95 percent confidence interval:
 -0.365  0.400
sample estimates:
(pseudo)median 
         0.045 

The estimate of the median, \(\tilde x\), is 0.045,

test statistic, \(V\), is 92,

the p-value, \(p\), is 0.7987,

the 95% confidence interval is [-0.365, 0.400].

Interpretation:

Note that for wilcox.test() in R, the p-value and confidence interval approaches may disagree in some edge cases, because the p-value is based on either the exact distribution or the normal approximation, while the confidence interval is sometimes based on approximations.

  • P-value: With the p-value (\(p = 0.7987\)) being greater than the level of significance 0.05, we fail to reject the null hypothesis that the population median is equal to 0.

  • Confidence Interval: With the null hypothesis median value (\(m = 0\)) being inside the confidence interval, \([-0.365, 0.400]\), we fail to reject the null hypothesis that the population median is equal to 0.
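
The two decision rules above can also be written as logical checks on the test object. This is a minimal sketch; the names alpha, m0, res, reject_by_p, and reject_by_ci are chosen here for illustration.

```r
data = c(0.78, -0.08, 0.25, -0.03, -0.04, 1.37,
         -0.23, 1.52, -1.55, 0.58, 0.12, 0.22,
         0.38, -0.50, -0.33, -1.02, -1.07, 0.30)
alpha = 0.05
m0 = 0

res = wilcox.test(data, mu = m0,
                  conf.int = TRUE, conf.level = 1 - alpha)

# Reject H0 if the p-value is below the level of significance
reject_by_p = res$p.value < alpha

# Reject H0 if m0 falls outside the confidence interval
reject_by_ci = m0 < res$conf.int[1] || m0 > res$conf.int[2]

reject_by_p    # FALSE: fail to reject H0
reject_by_ci   # FALSE: fail to reject H0
```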

3 Two-tailed One Sample Wilcoxon Signed-Rank Test in R

Using the sleep$extra data from the "datasets" package, which has 20 observations; 10 of them are shown below, and the test uses all 20:

sleep$extra
 [1]  0.7 -0.1  3.4  3.7  2.0  1.1  4.4  5.5  1.6  3.4

For the following null hypothesis \(H_0\), and alternative hypothesis \(H_1\), with the level of significance \(\alpha=0.1\), without continuity correction.

\(H_0:\) the population median is equal to 0.5 (\(m = 0.5\)).

\(H_1:\) the population median is not equal to 0.5 (\(m \neq 0.5\), hence the default two-sided).

Because the level of significance is \(\alpha=0.1\), the level of confidence is \(1 - \alpha = 0.9\).

wilcox.test(sleep$extra, alternative = "two.sided",
            mu = 0.5, correct = FALSE,
            conf.int = TRUE, conf.level = 0.9)
Warning in wilcox.test.default(sleep$extra, alternative = "two.sided", mu =
0.5, : cannot compute exact p-value with ties
Warning in wilcox.test.default(sleep$extra, alternative = "two.sided", mu =
0.5, : cannot compute exact confidence interval with ties

    Wilcoxon signed rank test

data:  sleep$extra
V = 152, p-value = 0.07924
alternative hypothesis: true location is not equal to 0.5
90 percent confidence interval:
 0.5500271 2.2500210
sample estimates:
(pseudo)median 
       1.49993 

The warnings arise because there are ties among the distances \(|x_i - m_0|\). Hence, the p-value and confidence interval are based on the normal approximation, not the exact distribution.
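
Whether an exact p-value is possible can be checked before running the test. The sketch below looks for zero differences and tied distances in sleep$extra with the null median 0.5; the names d, has_zero, and has_ties are chosen here for illustration.

```r
d = sleep$extra - 0.5               # differences from the null hypothesis median

has_zero = any(d == 0)              # any observation equal to the null median?
has_ties = any(duplicated(abs(d)))  # any repeated distance |x_i - m0|?

has_zero   # FALSE
has_ties   # TRUE, so wilcox.test() warns about ties
```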

Interpretation:

  • P-value: With the p-value (\(p = 0.07924\)) being less than the level of significance 0.1, we reject the null hypothesis that the population median is equal to 0.5.

  • Confidence Interval: With the null hypothesis median value (\(m = 0.5\)) being outside the confidence interval, \([0.5500271, 2.2500210]\), we reject the null hypothesis that the population median is equal to 0.5.

4 One-tailed One Sample Wilcoxon Signed-Rank Test in R

Right Tailed Test

Using the trees$Girth data from the "datasets" package, which has 31 observations; 10 of them are shown below, and the tests use all 31:

trees$Girth
 [1]  8.3 10.5 11.3 11.4 12.0 12.9 13.7 13.8 14.5 20.6

For the following null hypothesis \(H_0\), and alternative hypothesis \(H_1\), with the level of significance \(\alpha=0.1\).

\(H_0:\) the population median is equal to 12.5 (\(m = 12.5\)).

\(H_1:\) the population median is greater than 12.5 (\(m > 12.5\), hence one-sided).

Because the level of significance is \(\alpha=0.1\), the level of confidence is \(1 - \alpha = 0.9\).

wilcox.test(trees$Girth, alternative = "greater",
            mu = 12.5,
            conf.int = TRUE, conf.level = 0.9)
Warning in wilcox.test.default(trees$Girth, alternative = "greater", mu = 12.5,
: cannot compute exact p-value with ties
Warning in wilcox.test.default(trees$Girth, alternative = "greater", mu = 12.5,
: cannot compute exact confidence interval with ties

    Wilcoxon signed rank test with continuity correction

data:  trees$Girth
V = 291.5, p-value = 0.1996
alternative hypothesis: true location is greater than 12.5
90 percent confidence interval:
 12.30006      Inf
sample estimates:
(pseudo)median 
      12.99677 

Interpretation:

  • P-value: With the p-value (\(p = 0.1996\)) being greater than the level of significance 0.1, we fail to reject the null hypothesis that the population median is equal to 12.5.

  • Confidence Interval: With the null hypothesis median value (\(m = 12.5\)) being inside the confidence interval, \([12.30006, \infty)\), we fail to reject the null hypothesis that the population median is equal to 12.5.

Left Tailed Test

For the following null hypothesis \(H_0\), and alternative hypothesis \(H_1\), with the level of significance \(\alpha=0.05\).

\(H_0:\) the population median is equal to 16 (\(m = 16\)).

\(H_1:\) the population median is less than 16 (\(m < 16\), hence one-sided).

Because the level of significance is \(\alpha=0.05\), the level of confidence is \(1 - \alpha = 0.95\).

wilcox.test(trees$Girth, alternative = "less",
            mu = 16,
            conf.int = TRUE, conf.level = 0.95)
Warning in wilcox.test.default(trees$Girth, alternative = "less", mu = 16, :
cannot compute exact p-value with ties
Warning in wilcox.test.default(trees$Girth, alternative = "less", mu = 16, :
cannot compute exact confidence interval with ties
Warning in wilcox.test.default(trees$Girth, alternative = "less", mu = 16, :
cannot compute exact p-value with zeroes
Warning in wilcox.test.default(trees$Girth, alternative = "less", mu = 16, :
cannot compute exact confidence interval with zeroes

    Wilcoxon signed rank test with continuity correction

data:  trees$Girth
V = 47.5, p-value = 7.363e-05
alternative hypothesis: true location is less than 16
95 percent confidence interval:
     -Inf 14.20002
sample estimates:
(pseudo)median 
      12.84995 

The warnings arise because there are ties among the distances \(|x_i - m_0|\), and \(x_i - m_0 = 0\) for at least one observation. Hence, the p-value and confidence interval are based on the normal approximation, not the exact distribution.
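
To see the effect of a zero difference, the sketch below counts the observations equal to the null median 16, drops them, and recomputes the effective sample size and the test statistic. The variable names are chosen here for illustration.

```r
d_all = trees$Girth - 16
n_zero = sum(d_all == 0)     # observations equal to the null median
n_zero

d = d_all[d_all != 0]        # zero differences are excluded from the ranking
n = length(d)                # effective sample size used in the z formula
n

V = sum(rank(abs(d))[d > 0]) # same statistic as reported by wilcox.test()
V
```

With one observation equal to 16, the effective sample size is 30 rather than 31.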

Interpretation:

  • P-value: With the p-value (\(p = 7.363 \times 10^{-5}\)) being less than the level of significance 0.05, we reject the null hypothesis that the population median is equal to 16.

  • Confidence Interval: With the null hypothesis median value (\(m = 16\)) being outside the confidence interval, \((-\infty, 14.20002]\), we reject the null hypothesis that the population median is equal to 16.

5 One Sample Wilcoxon Signed-Rank Test: Test Statistics & P-values in R

Here for a one sample Wilcoxon signed-rank test, we show how to get the test statistics (and z-value), and p-values from the wilcox.test() function in R, or by written code.

data_os = trees$Girth
wsrt_object = wilcox.test(data_os,
                          alternative = "two.sided",
                          correct = TRUE,
                          mu = 15)
Warning in wilcox.test.default(data_os, alternative = "two.sided", correct =
TRUE, : cannot compute exact p-value with ties
wsrt_object

    Wilcoxon signed rank test with continuity correction

data:  data_os
V = 104, p-value = 0.004913
alternative hypothesis: true location is not equal to 15

To get the test statistic and z-value:

\[V = \sum_{i=1}^N R_i \cdot I_{(x_i-m_0)>0}.\]

With continuity correction:

\[z = \frac{(V + c) - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}-\frac{\sum_{k=1}^{T}(t_k^3-t_k)}{48}}}.\]

For two-sided test, \(c=-0.5\) when \(V>\frac{n(n+1)}{4}\), \(c=0.5\) when \(V<\frac{n(n+1)}{4}\), and \(c=0\) when \(V=\frac{n(n+1)}{4}\). For one-sided test, when alternative is "greater", \(c=-0.5\), when alternative is "less", \(c=0.5\).

wsrt_object$statistic
  V 
104 
# to remove name V
unname(wsrt_object$statistic)
[1] 104

Same as:

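mu = 15
diffs = data_os - mu
diffs = diffs[diffs != 0]   # drop zero differences, as wilcox.test() does
r = rank(abs(diffs))        # ranks of the distances to mu
V = sum(r[diffs > 0])       # sum of the ranks of the positive differences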
V
[1] 104

For z-value:

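cc = 0.5 # c in the formula: two-sided and V < n*(n + 1)/4 (104<248)
ties = table(r) # t_k: number of observations at each rank value
n = length(diffs[diffs != 0]) # number of non-zero differences
num = (V + cc) - n*(n + 1)/4
denom = sqrt(n*(n + 1)*(2*n + 1)/24 - sum(ties^3 - ties)/48)
z = num/denom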
z
[1] -2.812712

To get the p-value for normal approximation:

Two-tailed: For a positive z-value (\(z^+\)) or a negative z-value (\(z^-\)),

\(p\text{-value} = 2 \cdot P(Z>z^+)\) or \(p\text{-value} = 2 \cdot P(Z<z^-)\).

One-tailed: For right-tail, \(p\text{-value} = P(Z>z)\), or for left-tail, \(p\text{-value} = P(Z<z)\).

wsrt_object$p.value
[1] 0.004912565

Same as:

Note that the p-value depends on the z-value (\(z = -2.812712\)). We use the distribution function pnorm() for the normal distribution in R. Because the z-value is rounded here, the last digit can differ slightly from wsrt_object$p.value.

2*pnorm(-2.812712); 2*(1-pnorm(2.812712))
[1] 0.004912563
[1] 0.004912563

One-tailed example:

# Right tailed
1-pnorm(2.812712)
# Left tailed
pnorm(-2.812712)
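
As a check on the left-tailed case, the sketch below compares pnorm() at the z-value above with the p-value from wilcox.test() for the same data and null median. For alternative = "less" the correction is \(c = 0.5\), the same as in the two-sided calculation above, so the z-value is unchanged. The names z, p_left, and p_builtin are chosen here for illustration.

```r
z = -2.812712                  # z-value computed above for mu = 15 (rounded)
p_left = pnorm(z)              # left-tailed p-value, P(Z < z)
p_left

# Built-in result with the normal approximation forced
p_builtin = wilcox.test(trees$Girth, mu = 15, alternative = "less",
                        exact = FALSE)$p.value
p_builtin
```

The two values should agree up to the rounding of the z-value, and each is half of the two-sided p-value.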

Copyright © 2020 - 2024. All Rights Reserved by Stats Codes