Here, we discuss the sign test for one sample in R with interpretations, including test statistics, p-values, and confidence intervals.
The sign test for one sample in R can be performed with the SIGN.test() function from the "BSDA" package.
The sign test for one sample can be used to test whether the median of the population from which an independent random sample is drawn is equal to a specified value (stated in the null hypothesis) or not. It is a non-parametric alternative to the one-sample t-test.
In the sign test for one sample, the test statistic is based on the signs of the observed values minus the null hypothesis median. It is the number of observations greater than the null hypothesis median, and inference is based on the binomial distribution.
Question | Is the median equal to \(m_0\)? | Is the median greater than \(m_0\)? | Is the median less than \(m_0\)? |
Form of Test | Two-tailed test | Right-tailed test | Left-tailed test |
Null Hypothesis, \(H_0\) | \(m = m_0\) | \(m = m_0\) | \(m = m_0\) |
Alternative Hypothesis, \(H_1\) | \(m \neq m_0\) | \(m > m_0\) | \(m < m_0\) |
To use the SIGN.test() function from the "BSDA" package, first install the package with install.packages("BSDA"), then load it with library(BSDA) before calling the function:
# Load the "BSDA" package, which provides SIGN.test()
library(BSDA)
# Create the data for the sign test for one sample
data = c(1.3, 0.5, -0.4, 1.4, -1.5, 0.5)
# Run the sign test for one sample with specifications
SIGN.test(data, alternative = "two.sided",
          md = 0, conf.level = 0.95)
One-sample Sign-Test
data: data
s = 4, p-value = 0.6875
alternative hypothesis: true median is not equal to 0
95 percent confidence interval:
-1.39 1.39
sample estimates:
median of x
0.5
Achieved and Interpolated Confidence Intervals:
Conf.Level L.E.pt U.E.pt
Lower Achieved CI 0.7812 -0.40 1.30
Interpolated CI 0.9500 -1.39 1.39
Upper Achieved CI 0.9688 -1.50 1.40
Argument | Usage |
x | Sample data values |
md | Population median value in null hypothesis |
alternative | Set alternative hypothesis as "greater", "less", or the default "two.sided" |
conf.level | Level of confidence for the test and confidence interval (default = 0.95) |
# Create data (random draws, so the output below varies from run to run)
data = rnorm(100)
# Create object
sgnt_object = SIGN.test(data, alternative = "two.sided",
md = 0, conf.level = 0.95)
# Extract a component
sgnt_object$statistic
s
52
Test Component | Usage |
sgnt_object$statistic | Test statistic; number of values greater than the null hypothesis median |
sgnt_object$p.value | P-value |
sgnt_object$estimate | Point estimate or sample median |
sgnt_object$conf.int | Confidence interval |
The sign test for one sample has a test statistic, \(S\), of the form:
\[S = \sum_{i=1}^n I_{(x_i-m_0)>0},\]
which is the number of observations greater than the null hypothesis median.
For \(n_d\) which is the number of non-zero \((x_i - m_0)\), inference on \(S\) is based on the binomial distribution with \(\text{size} = n_d\) and \(\text{prob} = 0.5\).
In the formula above:
\(x_i\) are the sample values,
\(m_0\) is the population median value to be tested and set in the null hypothesis,
\(I_{(x_i-m_0)>0}\) is \(1\) when \((x_i-m_0)>0\) and \(0\) otherwise, and
\(n\) is the sample size.
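The definition above can be sketched in base R without the "BSDA" package. This is a minimal sketch, assuming the six-value data set from the SIGN.test() example earlier and a null hypothesis median of 0; the names x, m0, S, n_d, and p_value are illustrative and not part of any package.

```r
# Sample values and null hypothesis median (illustrative names)
x  <- c(1.3, 0.5, -0.4, 1.4, -1.5, 0.5)
m0 <- 0

# S: number of observations strictly greater than m0
S <- sum(x > m0)

# n_d: number of observations not equal to m0 (zero differences are dropped)
n_d <- sum(x != m0)

# Two-sided p-value from Binomial(n_d, 0.5) using base R's binom.test()
p_value <- binom.test(S, n_d, p = 0.5, alternative = "two.sided")$p.value

print(c(S = S, n_d = n_d, p_value = p_value))
```

For these six values, S is 4 and n_d is 6, in line with s = 4 and p-value = 0.6875 in the SIGN.test() output shown earlier.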
See also the Wilcoxon signed rank test for one sample and the sign test for paired samples.
Enter the data by hand.
Consider the following null hypothesis \(H_0\) and alternative hypothesis \(H_1\), with the level of significance \(\alpha=0.05\):
\(H_0:\) the population median is equal to 0 (\(m = 0\)).
\(H_1:\) the population median is not equal to 0 (\(m \neq 0\), hence the default two-sided).
Because the level of significance is \(\alpha=0.05\), the level of confidence is \(1 - \alpha = 0.95\).
The SIGN.test() function has the default alternative as "two.sided", the default median as 0, and the default level of confidence as 0.95. Hence, you do not need to specify the "alternative", "md", and "conf.level" arguments in this case.
Or:
One-sample Sign-Test
data: data
s = 4, p-value = 0.1185
alternative hypothesis: true median is not equal to 0
95 percent confidence interval:
-1.5643663 0.3930989
sample estimates:
median of x
-0.7
Achieved and Interpolated Confidence Intervals:
Conf.Level L.E.pt U.E.pt
Lower Achieved CI 0.8815 -1.4000 -0.1000
Interpolated CI 0.9500 -1.5644 0.3931
Upper Achieved CI 0.9648 -1.6000 0.5000
The sample median, \(\tilde x\), is -0.7; the test statistic, \(S\), is 4; the p-value, \(p\), is 0.1185; and the interpolated 95% confidence interval is [-1.5643663, 0.3930989].
Note that for SIGN.test() in R, the two methods (p-value and confidence interval) may disagree in some edge cases, because the p-value is based on the binomial distribution, whereas the confidence interval is based on interpolation between the achieved confidence intervals from the binomial distribution.
P-value: With the p-value (\(p = 0.1185\)) being greater than the level of significance 0.05, we fail to reject the null hypothesis that the population median is equal to 0. This is the same as the "two.sided" test for \(\text{Binomial}(n_d, 0.5)\) with \(n_d = 15\), that is, binom.test(4, 15, 0.5, "two.sided")$p.value \(= 0.1184692\).
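As a quick check of this equivalence, the sketch below recomputes the p-value directly from the binomial distribution function. It assumes, as in the binom.test() call above, \(S = 4\) and \(n_d = 15\); the variable names are illustrative.

```r
S   <- 4   # test statistic from the output above
n_d <- 15  # number of non-zero differences, as used in the binom.test() call

# p-value reported by the exact binomial test
p_binom_test <- binom.test(S, n_d, 0.5, "two.sided")$p.value

# Because S < n_d / 2, the two-sided p-value is twice the lower tail P(X <= S)
p_manual <- 2 * pbinom(S, n_d, 0.5)

print(c(p_binom_test = p_binom_test, p_manual = p_manual))
```

Both values should agree with the 0.1184692 quoted above, since the \(\text{Binomial}(n_d, 0.5)\) distribution is symmetric.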
Confidence Interval: With the null hypothesis median value (\(m = 0\)) being inside the confidence interval, \([-1.5643663, 0.3930989]\), we fail to reject the null hypothesis that the population median is equal to 0.
To get the critical value for a sign test for one sample in R, you can use the qbinom() function for the binomial distribution to derive the quantile associated with the given level of significance \(\alpha\).
With \(n_d\) as the number of non-zero \((x_i - m_0)\), as above:
For a two-tailed test with level of significance \(\alpha\), the critical values are qbinom(\(\alpha/2\), \(n_d\), 0.5) - 1 and qbinom(\(1-\alpha/2\), \(n_d\), 0.5) + 1.
For a one-tailed test with level of significance \(\alpha\), the critical value is qbinom(\(\alpha\), \(n_d\), 0.5) - 1 for a left-tailed test and qbinom(\(1-\alpha\), \(n_d\), 0.5) + 1 for a right-tailed test.
The null hypothesis is rejected when \(S\) is less than or equal to a lower critical value or greater than or equal to an upper critical value.
Example:
For \(\alpha = 0.05\), \(n_d = 65\).
Two-tailed:
[1] 24
[1] 41
One-tailed:
[1] 25
[1] 40
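The values in this example can be reproduced with a short base R sketch that applies the qbinom() expressions given above. The variable names (alpha, n_d, lower_two, and so on) are chosen here for illustration.

```r
alpha <- 0.05
n_d   <- 65

# Two-tailed critical values
lower_two <- qbinom(alpha / 2, n_d, 0.5) - 1
upper_two <- qbinom(1 - alpha / 2, n_d, 0.5) + 1

# One-tailed critical values
lower_one <- qbinom(alpha, n_d, 0.5) - 1      # left-tailed
upper_one <- qbinom(1 - alpha, n_d, 0.5) + 1  # right-tailed

print(c(lower_two, upper_two))
print(c(lower_one, upper_one))
```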
Using the Indometh$conc data from the "datasets" package with 10 sample observations from 66 observations below:
[1] 1.50 0.07 0.36 2.72 0.22 0.59 2.05 0.39 0.84 0.09
Consider the following null hypothesis \(H_0\) and alternative hypothesis \(H_1\), with the level of significance \(\alpha=0.1\):
\(H_0:\) the population median is equal to 0.15 (\(m = 0.15\)).
\(H_1:\) the population median is not equal to 0.15 (\(m \neq 0.15\), hence the default two-sided).
Because the level of significance is \(\alpha=0.1\), the level of confidence is \(1 - \alpha = 0.9\).
One-sample Sign-Test
data: Indometh$conc
s = 43, p-value = 0.01866
alternative hypothesis: true median is not equal to 0.15
90 percent confidence interval:
0.1980247 0.5017279
sample estimates:
median of x
0.34
Achieved and Interpolated Confidence Intervals:
Conf.Level L.E.pt U.E.pt
Lower Achieved CI 0.8911 0.200 0.4800
Interpolated CI 0.9000 0.198 0.5017
Upper Achieved CI 0.9360 0.190 0.5900
P-value: With the p-value (\(p = 0.01866\)) being less than the level of significance 0.1, we reject the null hypothesis that the population median is equal to 0.15.
\(S\)-statistic: With \(n_d = 66\), and the test statistic value (\(S = 43\)) being greater than or equal to \(\text{qbinom(0.95, 66, 0.5)+1}=41\), or being within \(41 \text{ to } 66\), we reject the null hypothesis that the population median is equal to 0.15.
Confidence Interval: With the null hypothesis median value (\(m = 0.15\)) being outside the confidence interval, \([0.1980247, 0.5017279]\), we reject the null hypothesis that the population median is equal to 0.15.
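The critical-value decision for this two-tailed example can be sketched in base R as follows. It is only an illustration, assuming \(n_d = 66\), \(S = 43\), and \(\alpha = 0.1\) as stated above; the names lower_crit, upper_crit, and reject are hypothetical helpers, and the lower critical value is included for completeness.

```r
alpha <- 0.1
n_d   <- 66
S     <- 43

# Two-tailed critical values from the binomial quantile function
lower_crit <- qbinom(alpha / 2, n_d, 0.5) - 1
upper_crit <- qbinom(1 - alpha / 2, n_d, 0.5) + 1

# Reject H0 when S falls in either tail region
reject <- (S <= lower_crit) || (S >= upper_crit)

print(c(lower_crit = lower_crit, upper_crit = upper_crit))
print(reject)
```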
Using the Indometh$conc data from the "datasets" package with 10 sample observations from 66 observations below:
[1] 1.50 0.48 0.70 0.64 0.08 0.07 0.13 0.11 0.06 0.09
Consider the following null hypothesis \(H_0\) and alternative hypothesis \(H_1\), with the level of significance \(\alpha=0.1\):
\(H_0:\) the population median is equal to 0.2 (\(m = 0.2\)).
\(H_1:\) the population median is greater than 0.2 (\(m > 0.2\), hence one-sided).
Because the level of significance is \(\alpha=0.1\), the level of confidence is \(1 - \alpha = 0.9\).
One-sample Sign-Test
data: Indometh$conc
s = 39, p-value = 0.06802
alternative hypothesis: true median is greater than 0.2
90 percent confidence interval:
0.2226711 Inf
sample estimates:
median of x
0.34
Achieved and Interpolated Confidence Intervals:
Conf.Level L.E.pt U.E.pt
Lower Achieved CI 0.8661 0.2300 Inf
Interpolated CI 0.9000 0.2227 Inf
Upper Achieved CI 0.9124 0.2200 Inf
P-value: With the p-value (\(p = 0.06802\)) being less than the level of significance 0.1, we reject the null hypothesis that the population median is equal to 0.2.
\(S\)-statistic: \(n_d = 66 - 1 = 65\) because one observation is \(0.2\). With \(n_d = 65\), and the test statistic value (\(S = 39\)) being greater than or equal to \(\text{qbinom(0.9, 65, 0.5)+1}=39\), or being within \(39 \text{ to } 65\), we reject the null hypothesis that the population median is equal to 0.2.
Confidence Interval: With the null hypothesis median value (\(m = 0.2\)) being outside the confidence interval, \([0.2226711, \infty)\), we reject the null hypothesis that the population median is equal to 0.2.
Consider the following null hypothesis \(H_0\) and alternative hypothesis \(H_1\), with the level of significance \(\alpha=0.05\):
\(H_0:\) the population median is equal to 0.4 (\(m = 0.4\)).
\(H_1:\) the population median is less than 0.4 (\(m < 0.4\), hence one-sided).
Because the level of significance is \(\alpha=0.05\), the level of confidence is \(1 - \alpha = 0.95\).
One-sample Sign-Test
data: Indometh$conc
s = 28, p-value = 0.1605
alternative hypothesis: true median is less than 0.4
95 percent confidence interval:
-Inf 0.5017279
sample estimates:
median of x
0.34
Achieved and Interpolated Confidence Intervals:
Conf.Level L.E.pt U.E.pt
Lower Achieved CI 0.9456 -Inf 0.4800
Interpolated CI 0.9500 -Inf 0.5017
Upper Achieved CI 0.9680 -Inf 0.5900
P-value: With the p-value (\(p = 0.1605\)) being greater than the level of significance 0.05, we fail to reject the null hypothesis that the population median is equal to 0.4.
\(S\)-statistic: \(n_d = 66 - 1 = 65\) because one observation is \(0.4\). With \(n_d = 65\), and the test statistic value (\(S = 28\)) being greater than \(\text{qbinom(0.05, 65, 0.5)-1}=25\), or not being within \(0 \text{ to } 25\), we fail to reject the null hypothesis that the population median is equal to 0.4.
Confidence Interval: With the null hypothesis median value (\(m = 0.4\)) being inside the confidence interval, \((-\infty, 0.5017279]\), we fail to reject the null hypothesis that the population median is equal to 0.4.
Here, for a sign test for one sample, we show how to get the test statistic (or S-value) and the p-value from the SIGN.test() function in R, or by written code.
data_os = Indometh$conc
sgnt_object = SIGN.test(data_os, alternative = "two.sided",
md = 0.3, conf.level = 0.9)
sgnt_object
One-sample Sign-Test
data: data_os
s = 34, p-value = 0.8043
alternative hypothesis: true median is not equal to 0.3
90 percent confidence interval:
0.1980247 0.5017279
sample estimates:
median of x
0.34
Achieved and Interpolated Confidence Intervals:
Conf.Level L.E.pt U.E.pt
Lower Achieved CI 0.8911 0.200 0.4800
Interpolated CI 0.9000 0.198 0.5017
Upper Achieved CI 0.9360 0.190 0.5900
\[S = \sum_{i=1}^n I_{(x_i-m_0)>0},\]
which is the number of observations greater than the null hypothesis median.
s
34
[1] 34
Same as:
[1] 34
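The count above can also be reproduced in base R without SIGN.test(). The following is a minimal sketch, assuming the Indometh data from the "datasets" package and the null hypothesis median of 0.3 used in this example; m0, S, and n_d are illustrative names.

```r
# Indometh ships with R in the "datasets" package
data_os <- datasets::Indometh$conc
m0 <- 0.3

# S: number of observations strictly greater than the null hypothesis median
S <- sum(data_os > m0)

# n_d: number of observations that differ from the null hypothesis median
n_d <- sum(data_os != m0)

print(c(S = S, n_d = n_d))
```

One of the 66 observations equals 0.3, so it is excluded from \(n_d\).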
With \(n_d\) as the number of non-zero \((x_i - m_0)\), and for \(X \sim \text{Binomial}(n_d, 0.5)\):
Two-tailed:
For \(S = n_d/2\), \(\text{p-value} = 1\).
For \(S > n_d/2\), \(\text{p-value} = 2 \times P(X \geq S) = 2 \times \sum_{x=S}^{n_d} P(X=x)\).
For \(S < n_d/2\), \(\text{p-value} = 2 \times P(X \leq S) = 2 \times \sum_{x=0}^{S} P(X=x)\).
One-tailed:
For right-tailed, \(\text{p-value} = P(X \geq S) = \sum_{x=S}^{n_d} P(X=x)\); for left-tailed, \(\text{p-value} = P(X \leq S) = \sum_{x=0}^{S} P(X=x)\).
[1] 0.804317
Same as:
[1] 65
Note that the form of the p-value depends on the test statistic (here \(S = 34 > n_d/2 = 32.5\)). We also use the distribution function pbinom() for the binomial distribution in R.
[1] 0.804317
[1] 0.804317
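The two-tailed rule above can be sketched in base R as follows. This is an illustrative sketch, assuming \(S = 34\) and \(n_d = 65\) from this example; p_value and p_check are names chosen here, and binom.test() is used only as a cross-check.

```r
S   <- 34
n_d <- 65

# Apply the two-tailed rule for X ~ Binomial(n_d, 0.5)
if (S == n_d / 2) {
  p_value <- 1
} else if (S > n_d / 2) {
  # 2 * P(X >= S), written as the upper tail beyond S - 1
  p_value <- 2 * pbinom(S - 1, n_d, 0.5, lower.tail = FALSE)
} else {
  # 2 * P(X <= S)
  p_value <- 2 * pbinom(S, n_d, 0.5)
}

# Cross-check against the exact binomial test in base R
p_check <- binom.test(S, n_d, 0.5, "two.sided")$p.value

print(c(p_value = p_value, p_check = p_check))
```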
One-tailed example:
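As a sketch of the one-tailed formulas, the code below recomputes the p-values for the two one-tailed Indometh examples shown earlier. It assumes \(n_d = 65\) with \(S = 39\) for the right-tailed test (median 0.2) and \(S = 28\) for the left-tailed test (median 0.4); p_right and p_left are illustrative names.

```r
n_d <- 65

# Right-tailed (H1: median > 0.2), S = 39: P(X >= 39)
p_right <- pbinom(39 - 1, n_d, 0.5, lower.tail = FALSE)

# Left-tailed (H1: median < 0.4), S = 28: P(X <= 28)
p_left <- pbinom(28, n_d, 0.5)

print(c(p_right = p_right, p_left = p_left))
```

These should be close to the p-values 0.06802 and 0.1605 reported by SIGN.test() in the corresponding examples.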
Copyright © 2020 - 2024. All Rights Reserved by Stats Codes