Wilcoxon Signed Rank Test in R with Example | R Tutorial 4.8 | MarinStatsLectures

MarinStatsLectures-R Programming & Statistics
29 Aug 2013 | 03:38

TLDR: In this video, Mike Marin introduces the Wilcoxon Signed Rank Test, a non-parametric method for testing the median difference between paired populations. Using R, he demonstrates the test on systolic blood pressure data measured before and after treatment. Viewers learn to use the 'wilcox.test' function, interpret the test statistic, p-value, and confidence interval, and understand why ties and zero differences prevent exact calculations. The tutorial also covers requesting approximate results with the 'exact' argument and turning off the continuity correction with the 'correct' argument.

Takeaways
  • The video is a tutorial on the Wilcoxon Signed Rank Test, a non-parametric statistical method used for paired or dependent populations.
  • The example data involves measurements of systolic blood pressure before and after a treatment, with 25 paired observations.
  • The 'wilcox.test' function in R is used to conduct the Wilcoxon Signed Rank Test.
  • A boxplot is suggested for visual examination of the data to compare before and after measurements.
  • The null hypothesis tested is that the median change in systolic blood pressure is zero.
  • The test is two-sided, meaning it checks whether the median difference departs from zero in either direction, not just an increase or just a decrease.
  • The 'mu' parameter is set to 0 in the 'wilcox.test' function to test if the median difference is zero.
  • The 'paired' argument is set to TRUE to indicate that the measurements are paired.
  • The 'conf.int' and 'conf.level' arguments can be used to obtain a confidence interval for the results.
  • Warnings are provided when exact p-values and confidence intervals cannot be calculated due to ties or zero differences.
  • The 'exact' argument can be set to FALSE for approximate calculations when exact values are not possible.
  • The 'correct' argument can be set to FALSE to disable continuity correction in the test.
  • The test result includes a test statistic of 267, a p-value of 0.00082, and a 99% confidence interval from 2 to 14, with a sample median difference of 7.5.
  • The next video in the series will discuss one-way analysis of variance (ANOVA).
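
The arguments listed above fit together into a single call. The following is a minimal sketch in R, assuming two numeric vectors named `before` and `after`. The names and the ten values are hypothetical stand-ins, not the 25 observations used in the video, so the numbers it prints will differ from those quoted above.

```r
# Hypothetical systolic blood pressure readings (mmHg) for 10 subjects.
# These values are made up for illustration; they are NOT the video's data.
before <- c(135, 142, 128, 150, 139, 147, 131, 144, 152, 138)
after  <- c(129, 131, 130, 141, 124, 143, 138, 131, 135, 128)

# Paired, two-sided Wilcoxon signed rank test of H0: median difference = 0,
# with a 99% confidence interval for that difference.
res <- wilcox.test(before, after,
                   mu = 0,
                   alternative = "two.sided",
                   paired = TRUE,
                   conf.int = TRUE,
                   conf.level = 0.99)
print(res)
```

The made-up differences here contain no ties and no zeros, so R can compute exact values for this sketch.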
Q & A
  • What is the Wilcoxon Signed Rank Test used for in statistics?

    -The Wilcoxon Signed Rank Test is a non-parametric statistical method used to examine the median difference between two related populations or paired samples.

  • What kind of data does the Wilcoxon Signed Rank Test require?

    -The test requires paired or dependent data, such as measurements taken before and after an event or treatment.

  • What is the purpose of the 'wilcox.test' command in R?

    -The 'wilcox.test' command in R is used to perform the Wilcoxon Signed Rank Test, allowing for the comparison of differences in paired observations.

  • Why might one examine a boxplot of the data before conducting the Wilcoxon test?

    -Examining a boxplot helps visualize the distribution of the data and can provide initial evidence of the differences between the paired observations, such as changes in blood pressure before and after treatment.

  • What is the null hypothesis being tested in the script's example?

    -The null hypothesis is that the median change in systolic blood pressure is 0, that is, the treatment produces no shift between the before and after measurements.

  • What does it mean to conduct a two-sided test in the context of the Wilcoxon Signed Rank Test?

    -A two-sided test means that the alternative hypothesis is that the median difference is not equal to zero, so blood pressure could be either higher or lower after the treatment.

  • How can one set up the 'wilcox.test' function in R to test for a median difference of zero?

    -In the 'wilcox.test' function, one can set the 'mu' argument to 0 to test if the median difference is zero, and set the 'paired' argument to TRUE to indicate that the samples are paired.

  • What is the significance of the 'conf.int' and 'conf.level' arguments in the 'wilcox.test' function?

    -The 'conf.int' argument, when set to TRUE, returns a confidence interval for the median difference, and the 'conf.level' argument specifies the level of confidence for the interval, such as 99%.

  • What does the 'exact' argument in the 'wilcox.test' function control?

    -The 'exact' argument controls whether R should calculate exact p-values and confidence intervals. Setting it to FALSE asks R to use a normal approximation instead, which is also what R falls back on when exact calculations are not possible, such as in the presence of ties or zero differences.

  • What is the purpose of the 'correct' argument in the 'wilcox.test' function?

    -The 'correct' argument controls the continuity correction that R applies in the normal approximation for the p-value. Setting it to FALSE turns the correction off. It only matters when the approximation is used; it has no effect on an exact p-value.

  • What results were obtained from the Wilcoxon test in the script?

    -The test resulted in a test statistic of 267, a p-value of 0.00082, indicating a statistically significant difference, and a 99% confidence interval from 2 to 14 for the median difference in systolic blood pressure.
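
The answers about 'exact' and 'correct' can be tried out on a small example. This is a sketch under stated assumptions: the vectors `before` and `after` are hypothetical values chosen to contain one zero difference and some tied differences, and they are not the data from the video.

```r
# Hypothetical readings (mmHg), made up for illustration.
# Differences are 6, 0, -2, 6, 15, 4, -2, 13: one zero and two sets of ties.
before <- c(135, 142, 128, 150, 139, 147, 131, 144)
after  <- c(129, 142, 130, 144, 124, 143, 133, 131)

# Ask for the normal approximation explicitly (exact = FALSE),
# once with and once without the continuity correction.
approx_corrected   <- wilcox.test(before, after, paired = TRUE,
                                  exact = FALSE, correct = TRUE)
approx_uncorrected <- wilcox.test(before, after, paired = TRUE,
                                  exact = FALSE, correct = FALSE)

approx_corrected$p.value
approx_uncorrected$p.value
```

The correction pulls the standardized statistic half a unit toward zero before it is scaled, so the corrected p-value comes out somewhat larger than the uncorrected one.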

Outlines
00:00
Introduction to Wilcoxon Signed Rank Test in R

In this video, Mike Marin introduces the Wilcoxon Signed Rank Test, a non-parametric statistical method used to evaluate the median difference between two related populations. The context provided is a study of systolic blood pressure changes before and after a treatment, with 25 paired observations. The video demonstrates the use of R software to perform this test, starting with importing and examining the data visually through a boxplot. The 'wilcox.test' function in R is highlighted as the primary tool for conducting the test, with a focus on testing the null hypothesis that the median change in blood pressure is zero, using a two-sided test approach.
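
The visual check described above might look like the following in R. This is a minimal sketch: the vectors `before` and `after` are hypothetical values, not the video's data, and plotting the differences as well is an added suggestion rather than a step confirmed by the source.

```r
# Hypothetical systolic blood pressure readings (mmHg), made up for illustration.
before <- c(135, 142, 128, 150, 139, 147, 131, 144, 152, 138)
after  <- c(129, 131, 130, 141, 124, 143, 138, 131, 135, 128)

# Side-by-side boxplots of the two sets of measurements.
bp <- boxplot(before, after,
              names = c("Before", "After"),
              ylab = "Systolic blood pressure (mmHg)",
              main = "Blood pressure before and after treatment")

# The test works on the per-subject differences, so it can help to
# look at those directly as well.
boxplot(before - after,
        ylab = "Change in systolic blood pressure (mmHg)",
        main = "Paired differences (before - after)")
abline(h = 0, lty = 2)  # dashed reference line at "no change"
```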

Conducting the Wilcoxon Signed Rank Test with R

The script details the steps to conduct the Wilcoxon Signed Rank Test in R, including accessing help for the 'wilcox.test' function and setting up the test with specific arguments. The 'mu' argument is set to zero to test for no difference in medians, 'alt' (which R matches to the full argument name 'alternative') is set to 'two.sided' for a two-sided test, and 'paired' is set to TRUE to indicate the paired nature of the data. The script also discusses the option to include a confidence interval by setting 'conf.int' to TRUE and specifying the 'conf.level'. The output of the test, including the test statistic, p-value, confidence interval, and sample median difference, is explained, along with the implications of ties in the ranks and the option to calculate approximate values by setting 'exact' to FALSE.
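
The pieces of output described above can also be read from the object that 'wilcox.test' returns. The sketch below uses hypothetical vectors `before` and `after` (made-up values, not the video's data). It also checks that the paired call agrees with a one-sample test on the differences. Note that the help page for 'wilcox.test' describes the reported point estimate as a pseudomedian rather than a plain sample median.

```r
# Hypothetical systolic blood pressure readings (mmHg), made up for illustration.
before <- c(135, 142, 128, 150, 139, 147, 131, 144, 152, 138)
after  <- c(129, 131, 130, 141, 124, 143, 138, 131, 135, 128)

# 'alt' is partially matched to the full argument name 'alternative'.
paired_res <- wilcox.test(before, after, mu = 0, alt = "two.sided",
                          paired = TRUE, conf.int = TRUE, conf.level = 0.99)

# The same test expressed as a one-sample test on the per-subject differences.
diff_res <- wilcox.test(before - after, mu = 0, alt = "two.sided",
                        conf.int = TRUE, conf.level = 0.99)

paired_res$statistic  # signed rank test statistic (labelled V)
paired_res$p.value    # p-value
paired_res$conf.int   # confidence interval for the location of the differences
paired_res$estimate   # point estimate of that location
```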

Conclusion and Upcoming Content

The video concludes with a summary of the results obtained from the Wilcoxon Signed Rank Test, including the test statistic of 267, a p-value of 0.00082, and a 99% confidence interval ranging from 2 to 14, indicating a significant median difference in systolic blood pressure post-treatment. The script also teases the next topic in the series, which will be one-way analysis of variance (ANOVA), and encourages viewers to subscribe to 'marinstatslectures' for further statistical insights.

Keywords
Wilcoxon Signed Rank Test
The Wilcoxon Signed Rank Test is a non-parametric statistical method used to determine whether there is a statistically significant difference between two paired or dependent samples. In the context of the video, it is used to examine the median difference in systolic blood pressure before and after a treatment. The test is appropriate when the data do not meet the assumptions of a parametric test, for example when the paired differences are not normally distributed.
R Statistical Software
R Statistical Software is a programming language and software environment for statistical computing and graphics. It is widely used among statisticians and data analysts for data manipulation, statistical modeling, and visualization. In the video, R is used to perform the Wilcoxon Signed Rank Test on the systolic blood pressure data.
non-parametric
Non-parametric methods in statistics do not assume a specific distribution for the data, unlike parametric methods which assume data follows a certain distribution, such as normal. The Wilcoxon Signed Rank Test is a non-parametric test, making it useful when the data does not meet the normality assumption, as mentioned in the script.
paired observations
Paired observations refer to data points that are matched in pairs, often used in before-and-after studies. In the video, the paired observations are the systolic blood pressure measurements taken before and after the treatment, which allows for a direct comparison of the effect of the treatment.
boxplot
A boxplot, or box-and-whisker plot, is a standardized way of displaying the distribution of a dataset based on a five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. In the video, a boxplot is used to visually compare the before and after measurements of systolic blood pressure.
null hypothesis
The null hypothesis is a statement of no effect or no difference, which is tested in a statistical hypothesis test. In the script, the null hypothesis is that the median change in systolic blood pressure is 0, implying no effect of the treatment.
two-sided test
A two-sided test is a type of hypothesis test that considers the possibility of a difference in either direction (e.g., the treatment could be better or worse). In the video, a two-sided test is conducted to determine if there is any significant difference in the median change of systolic blood pressure, regardless of direction.
wilcox.test
'wilcox.test' is the R function used to perform the Wilcoxon Signed Rank Test for paired samples. In the video, this function is used to analyze the change in systolic blood pressure, with specific arguments set to define the test as two-sided and paired.
p-value
The p-value is the probability that the observed data (or something more extreme) would occur if the null hypothesis were true. A small p-value, like the one mentioned in the script (0.00082), indicates strong evidence against the null hypothesis, suggesting that the treatment had a significant effect on systolic blood pressure.
confidence interval
A confidence interval provides a range of values that are likely to contain the true population parameter with a certain level of confidence. In the video, a 99% confidence interval is calculated for the median difference in systolic blood pressure, indicating the range within which the true median difference is likely to fall.
ties
Ties occur when two or more values are equal. In the Wilcoxon Signed Rank Test the values being ranked are the absolute paired differences, so a tie means that two pairs differ by the same amount; a difference of exactly zero causes a similar problem. Either one prevents R from calculating an exact p-value and confidence interval, as mentioned in the script, so approximate values are used instead.
continuity correction
Continuity correction is an adjustment applied when a discrete test statistic is approximated by a continuous distribution, here the normal approximation to the signed rank statistic. In the script, it is mentioned that the 'correct' argument can be set to FALSE to avoid using a continuity correction in the Wilcoxon test.
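
To connect the terms above (test statistic, ties, p-value, continuity correction), the signed rank statistic and its continuity-corrected normal approximation can be worked out by hand and compared with 'wilcox.test'. This is a sketch under stated assumptions: the differences in `d` are hypothetical values with no ties and no zeros, and the variance formula used here is the one that applies only in that tie-free case.

```r
# Hypothetical paired differences (before - after), made up for illustration.
# No two absolute values are equal and none is zero.
d <- c(6, 11, -2, 9, 15, 4, -7, 13, 17, 10)
n <- length(d)

# Test statistic: rank the absolute differences, then sum the ranks
# that belong to the positive differences.
r <- rank(abs(d))
V <- sum(r[d > 0])

# Normal approximation to the null distribution of V (valid without ties),
# with a continuity correction of 0.5 toward the null mean.
mean_V   <- n * (n + 1) / 4
sd_V     <- sqrt(n * (n + 1) * (2 * n + 1) / 24)
z        <- (V - mean_V - 0.5 * sign(V - mean_V)) / sd_V
p_approx <- 2 * pnorm(-abs(z))  # two-sided p-value

# Compare with wilcox.test's own approximate, continuity-corrected result.
check <- wilcox.test(d, mu = 0, exact = FALSE, correct = TRUE)
c(by_hand = p_approx, from_wilcox_test = check$p.value)
```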
Highlights

Introduction to the Wilcoxon Signed Rank Test by Mike Marin.

The test is a non-parametric method for examining median differences in paired populations.

The use case involves systolic blood pressure measurements before and after treatment.

Data consists of 25 paired observations.

Importing and attaching data in R Statistical Software.

Exploring changes in systolic blood pressure using the 'wilcox.test' command in R.

Using the Help menu in R for command assistance.

Visual analysis with a boxplot to compare before and after measurements.

Null hypothesis testing for a median change of 0 in systolic blood pressure.

Conducting a two-sided test using 'wilcox.test' with the 'alt' argument set to 'two.sided'.

Setting the 'paired' argument to TRUE for paired measurements.

Obtaining a confidence interval with the 'conf.int' and 'conf.level' arguments.

Handling ties and differences of 0 in rank calculations.

Setting the 'exact' argument to FALSE to calculate approximate p-values and confidence intervals.

Disabling continuity correction with the 'correct' argument.

Results presentation including test statistic, p-value, confidence interval, and median difference.

Upcoming discussion on one-way ANOVA in the next video.

Encouragement to subscribe to MarinStatsLectures for more statistical insights.
