Permutation Hypothesis Test in R with Examples | R Tutorial 4.6 | MarinStatsLectures

MarinStatsLectures-R Programming & Statistics
9 May 2019 | 14:33

TLDR
In this educational video, Mike Marin walks through implementing a permutation test in R to compare a numeric variable between two groups, as an alternative to the independent two-sample t-test or the Mann-Whitney U test. The tutorial recaps the idea behind permutation testing, explores the data, and calculates two test statistics: the absolute difference in means and the absolute difference in medians. It then shows how to generate permutation datasets, calculate p-values, and interpret the results, stressing that statistical significance should be weighed alongside scientific or clinical significance.

Takeaways
  • The video discusses implementing a permutation test to compare a numeric variable between two groups using statistical software.
  • The permutation test is an alternative to the independent two-sample t-test and the Mann-Whitney U test (Wilcoxon rank-sum test).
  • The video provides a recap of the permutation test concept and its application to a specific dataset, with links to related videos and resources.
  • The dataset involves comparing weight gain after six weeks for chicks on two different diets, with two variables: weight and feed type.
  • Two test statistics are used for demonstration: the absolute difference in mean weight and the absolute difference in median weight for the two feed types.
  • The video demonstrates calculating these test statistics using R commands, including the mean and median weight for each feed type.
  • The permutation test involves setting a seed for reproducibility, initializing permutation samples, and generating 100,000 permutation datasets.
  • The test statistics are calculated for each permutation sample, comparing the absolute differences in means and medians.
  • The p-value is estimated by counting how often the permutation test statistics are more extreme than the observed test statistics.
  • The video emphasizes the difference between statistical significance and scientific or clinical significance, cautioning against relying solely on p-values.
  • Additional resources include code for plotting the sampling distribution and reshuffling labels, as well as a discussion on the limitations of permutation tests for constructing confidence intervals.
Q & A
  • What is the main topic of the video by Mike Marin?

    -The video discusses implementing a permutation test approach to compare a numeric variable for two groups using statistical software, as an alternative to the independent two-sample t-test or the Mann-Whitney U test.

  • What are the two test statistics used in the video for the permutation test?

    -The two test statistics are the absolute difference in mean weight between the two feed types and the absolute difference in median weight between the two feed types.

  • What is the data set used in the video about?

    -The data set records weight gain after six weeks for chicks on two different diets, casein and meat meal.

  • How many observations are there in the data set used in the video?

    -There are a total of 23 observations in the data set, with 12 measurements on casein and 11 on meat meal.

  • What is the purpose of setting a seed in the permutation test?

    -Setting a seed allows for the generation of the exact same set of random data each time the code is run, which is useful for reproducibility of results.

  • How many permutation datasets are generated in the video's example?

    -100,000 permutation datasets are generated in the video's example.

  • What is the observed absolute difference in means for the two feed types in the video?

    -The observed absolute difference in means is 46.67 grams.

  • What is the observed absolute difference in medians for the two feed types in the video?

    -The observed absolute difference in medians is 79 grams.

  • What is the p-value obtained for test statistic one after running 100,000 permutations?

    -The p-value obtained for test statistic one is approximately 9.747% or 0.09747.

  • What is the p-value obtained for test statistic two after running 100,000 permutations?

    -The p-value obtained for test statistic two is approximately 5.42% or 0.0542.

  • What is the difference between statistical significance and scientific or clinical significance mentioned in the video?

    -Statistical significance refers to whether the observed results are unlikely to have occurred by chance if the null hypothesis were true, often determined by a p-value threshold like 5%. Scientific or clinical significance refers to the practical importance or meaningfulness of the results in a real-world context, which is not strictly determined by p-values.

  • Why might the video suggest that a permutation test might not be the best approach for constructing confidence intervals?

    -The video suggests that permutation testing does not directly allow for the construction of confidence intervals, whereas bootstrapping, a closely related concept, does allow for it.

  • What alternative method to permutation testing is mentioned in the video for constructing confidence intervals?

    -Bootstrapping is mentioned as an alternative method to permutation testing for constructing confidence intervals.

  • What is the relationship between permutation testing and bootstrapping mentioned in the video?

    -Both methods resample the observed data. A permutation test reshuffles the observations across the two groups without replacement, which breaks any association between group and outcome, whereas bootstrapping resamples with replacement from the data to create many simulated samples.
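
The contrast between the two resampling schemes can be sketched in a few lines of R. This is a minimal illustration, not code from the video: the vector `x` and the seed are made-up values chosen only to show how `sample()` behaves with and without replacement.

```r
# Toy vector standing in for a set of observed values (illustrative only)
x <- c(10, 20, 30, 40, 50)

set.seed(1)  # arbitrary seed so the draws can be reproduced

# Permutation: reorder the values WITHOUT replacement,
# so every original value appears exactly once
perm_x <- sample(x)

# Bootstrap: draw WITH replacement,
# so some values may repeat and others may be left out
boot_x <- sample(x, replace = TRUE)

perm_x
boot_x
```

A permuted vector always contains the same values as the original, only in a different order, while a bootstrap sample has the same length but usually a different composition.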

Outlines
00:00
Introduction to Permutation Test for Group Comparison

Mike Marin introduces a video on using permutation tests to compare a numeric variable between two groups, offering an alternative to the t-test or Mann-Whitney U test. He provides a recap of the concept and approach of permutation tests, referencing a previous video for more details. The data set involves weight gain in chicks on different diets, with 23 observations split between casein and meat meal diets. A box plot is used to visualize the data. Two test statistics are defined for demonstration: the absolute difference in mean weight and the absolute difference in median weight. The video includes R script examples for calculating these statistics.
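
The exploration step and the two observed test statistics might look like the sketch below. It assumes that R's built-in `chickwts` data, restricted to the casein and meatmeal feeds, matches the data file used in the video; the object names (`d`, `test_stat1`, `test_stat2`) are illustrative rather than taken from the video's script.

```r
# Assumption: the built-in chickwts data, restricted to two feeds,
# stands in for the data file used in the video
d <- droplevels(subset(chickwts, feed %in% c("casein", "meatmeal")))

table(d$feed)                      # number of observations per feed type
boxplot(weight ~ feed, data = d,
        ylab = "Weight (g)")       # visual comparison of the two groups

# Test statistic 1: absolute difference in mean weight
group_means <- tapply(d$weight, d$feed, mean)
test_stat1  <- unname(abs(diff(group_means)))

# Test statistic 2: absolute difference in median weight
group_medians <- tapply(d$weight, d$feed, median)
test_stat2    <- unname(abs(diff(group_medians)))

test_stat1
test_stat2
```

If the data match, the two statistics should agree with the values reported in the video, about 46.67 grams for the means and 79 grams for the medians.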

05:01
Generating Permutation Samples and Calculating Test Statistics

The script details the process of generating 100,000 permutation samples by reshuffling the weight variable to create new datasets. It explains initializing a matrix to store these samples and using a loop to fill it. The video then calculates the test statistics for each permutation sample by comparing the mean or median weights for the two feed types within each permuted dataset. The process is kept transparent for educational purposes, though it could be streamlined with a function. The observed test statistics are compared with the permutation results to estimate the p-value: the probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the one observed.
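
The matrix-and-loop approach described here can be sketched as follows. The built-in `chickwts` data is assumed to stand in for the video's data file, the seed is arbitrary, and the number of permutations `P` is reduced to 10,000 to keep the run short (the video uses 100,000). All object names are illustrative.

```r
# Assumption: built-in chickwts data, restricted to the two feeds
d <- droplevels(subset(chickwts, feed %in% c("casein", "meatmeal")))
n <- nrow(d)

set.seed(1979)  # arbitrary seed, only for reproducibility
P <- 10000      # reduced for speed; the video generates 100,000

# One column per permutation sample; each column is a reshuffle of weight
perm_samples <- matrix(0, nrow = n, ncol = P)
for (i in seq_len(P)) {
  perm_samples[, i] <- sample(d$weight, size = n, replace = FALSE)
}

# The feed labels stay fixed; the shuffled weights are reassigned to them
is_casein  <- d$feed == "casein"
perm_stat1 <- numeric(P)  # absolute difference in means
perm_stat2 <- numeric(P)  # absolute difference in medians
for (i in seq_len(P)) {
  w <- perm_samples[, i]
  perm_stat1[i] <- abs(mean(w[is_casein]) - mean(w[!is_casein]))
  perm_stat2[i] <- abs(median(w[is_casein]) - median(w[!is_casein]))
}

head(perm_stat1, 15)  # first 15 permutation test statistics
```

Storing every permutation sample in a matrix uses more memory than strictly needed, but it keeps each step visible, which fits the teaching purpose described above.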

10:04
Interpreting Permutation Test Results and P-Value Calculation

The final part of the script focuses on calculating and interpreting the p-value. The p-value is estimated as the proportion of permutation test statistics that are at least as extreme as the observed value, which approximates the probability of seeing such a statistic if the null hypothesis is true. The script walks step by step through this calculation for both test statistics, and the resulting p-values indicate that the observed differences are not statistically significant at the 5% level. However, the video emphasizes the importance of considering effect size and power, especially with small sample sizes, and notes that permutation tests do not directly yield confidence intervals, suggesting bootstrapping as an alternative approach.
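
The p-value calculation can be sketched in a more compact, function-based form. This is one possible arrangement, not the video's script: `abs_diff` is a hypothetical helper, the built-in `chickwts` data is assumed to match the video's data, the seed is arbitrary, and `P` is reduced to 10,000.

```r
# Assumption: built-in chickwts data, restricted to the two feeds
d <- droplevels(subset(chickwts, feed %in% c("casein", "meatmeal")))

# Hypothetical helper: absolute difference in a summary statistic
# (mean, median, ...) between the two feed types
abs_diff <- function(w, g, stat) {
  abs(stat(w[g == "casein"]) - stat(w[g == "meatmeal"]))
}

# Observed test statistics
obs_stat1 <- abs_diff(d$weight, d$feed, mean)
obs_stat2 <- abs_diff(d$weight, d$feed, median)

set.seed(1979)  # arbitrary seed
P <- 10000      # reduced for speed; the video uses 100,000

# Each replicate reshuffles the weights and returns both statistics,
# so perm_stats is a 2 x P matrix
perm_stats <- replicate(P, {
  w <- sample(d$weight)
  c(abs_diff(w, d$feed, mean), abs_diff(w, d$feed, median))
})

# Proportion of permutation statistics at least as extreme as the observed one
p_value1 <- mean(perm_stats[1, ] >= obs_stat1)
p_value2 <- mean(perm_stats[2, ] >= obs_stat2)

p_value1
p_value2
```

Because the p-values are Monte Carlo estimates, they vary slightly with the seed and the number of permutations, so they will not reproduce the video's figures digit for digit.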

Keywords
Permutation Test
A permutation test is a non-parametric statistical test that compares two groups without assuming a specific distribution for the data. It is used to determine if there is a significant difference between the groups by calculating a test statistic and then permuting the data to see how often the test statistic is as extreme as, or more extreme than, the observed value. In the video, the permutation test is used to compare the weight gain of chicks on two different diets, with the test statistics being the absolute difference in means and medians of the weight gain.
Independent Two-Sample T-Test
The independent two-sample t-test is a statistical method used to determine if there is a significant difference between the means of two groups, assuming the data in each group are approximately normally distributed. The video presents the permutation test as an alternative to this test for comparing weight gain between the two feed types.
Mann-Whitney U Test
Also known as the Wilcoxon rank-sum test, the Mann-Whitney U test is a non-parametric statistical test used to compare the distributions of two groups. It is an alternative to the t-test when the data do not meet the assumption of normality. The video refers to this test as another method for comparing the weight gain of the two groups of chicks.
Bootstrapping
Bootstrapping is a resampling technique used to estimate the accuracy of sample statistics by resampling the data with replacement and calculating the statistic of interest many times. The video mentions a previous video explaining the bootstrapping approach, which is related to permutation testing but focuses on constructing confidence intervals.
Test Statistic
A test statistic is a numerical value calculated from sample data during a statistical test. It is used to determine whether the null hypothesis can be rejected. In the video, two test statistics are used to compare the weight gain of chicks on different diets: the absolute difference in mean weight and the absolute difference in median weight.
Feed Type
In the context of the video, feed type refers to the different diets given to the chicks, which are casein and meat meal. The video compares the weight gain of chicks on these two feed types to determine if there is a significant difference after six weeks.
Weight Gain
Weight gain is the increase in body weight over a period of time. In the video, weight gain is the numeric variable being analyzed to compare the effects of two different feed types on the growth of chicks.
P-Value
The p-value is the probability of obtaining results at least as extreme as the observed results, assuming that the null hypothesis is true. The video explains how to estimate the p-value from the permutation test statistics and interprets the resulting p-values to assess the significance of the weight gain difference between the two feed types.
Null Hypothesis
The null hypothesis is a statement of no effect or no difference, which is tested in a statistical study. In the video, the null hypothesis is that there is no difference in weight gain between chicks on the two different feed types.
Significance Level (Alpha)
The significance level, often denoted as alpha, is the threshold for determining statistical significance in a hypothesis test. If the p-value is less than the significance level, the null hypothesis is rejected. The video mentions a common alpha level of 5% and discusses the implications of p-values near this threshold.
Confidence Interval
A confidence interval provides a range of values within which the true population parameter is likely to fall, with a certain level of confidence. While the video focuses on permutation tests, it mentions that bootstrapping, a related technique, can be used to construct confidence intervals.
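
As a rough illustration of how bootstrapping can produce a confidence interval, the sketch below computes a percentile interval for the difference in mean weight. It is not taken from the video: the within-group resampling scheme, the percentile method, the seed, and the number of resamples `B` are all assumptions made for this example, and the built-in `chickwts` data is assumed to match the video's data.

```r
# Assumption: built-in chickwts data, restricted to the two feeds
d <- droplevels(subset(chickwts, feed %in% c("casein", "meatmeal")))
casein   <- d$weight[d$feed == "casein"]
meatmeal <- d$weight[d$feed == "meatmeal"]

set.seed(1979)  # arbitrary seed
B <- 10000      # number of bootstrap resamples (illustrative choice)

# Resample each group WITH replacement and record the difference in means
boot_diff <- replicate(B, {
  mean(sample(casein, replace = TRUE)) - mean(sample(meatmeal, replace = TRUE))
})

# Percentile 95% interval for the difference in mean weight
ci <- quantile(boot_diff, probs = c(0.025, 0.975))
ci
```

Unlike the permutation test, which shuffles observations across groups to mimic the null hypothesis, this resampling keeps each group intact, which is why it describes the uncertainty in the estimated difference rather than testing for no difference.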
Highlights

Introduction to a permutation test as an alternative to the independent two-sample t-test or the Mann-Whitney U test.

Explanation of the concept and general approach of a permutation test in a previous video.

Data set overview with two variables: weight and feed type, comparing weight gain for chicks on two different diets.

Demonstration of box plot analysis to explore data distribution between the two feed types.

Selection of two test statistics for demonstration: absolute difference in mean and median weight.

Calculation of test statistics for casein and meat meal diets and their absolute differences.

Introduction of permutation test setup, including setting a seed for reproducibility.

Initialization of a matrix to store permutation samples and explanation of the permutation process.

Loop implementation to generate 100,000 permutations of the weight variable.

Calculation of test statistics for each permutation sample without using a function for clarity.

Observation of the first 15 permutation test statistics to understand the distribution.

Definition and calculation of p-values for the permutation test statistics.

Interpretation of p-values and their implications for hypothesis testing.

Discussion on the difference between statistical significance and scientific or clinical significance.

Acknowledgment of the small sample size and its impact on the power to detect differences.

Mention of the inability to construct confidence intervals using permutation tests, unlike bootstrapping.

Inclusion of additional code for plotting the sampling distribution and reshuffling labels in the script.

Encouragement to subscribe to the channel and share the video for further exploration of the topic.
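
The highlights mention extra code for plotting the sampling distribution and for reshuffling labels. A minimal sketch of both ideas, under the assumption that the built-in `chickwts` data matches the video's data, is shown here. It shuffles the feed labels rather than the weights, which yields an equivalent permutation test; the seed, `P`, and object names are illustrative.

```r
# Assumption: built-in chickwts data, restricted to the two feeds
d <- droplevels(subset(chickwts, feed %in% c("casein", "meatmeal")))

# Observed absolute difference in mean weight
obs_stat <- abs(mean(d$weight[d$feed == "casein"]) -
                mean(d$weight[d$feed == "meatmeal"]))

set.seed(1979)  # arbitrary seed
P <- 10000      # reduced for speed; the video uses 100,000

# Reshuffle the feed LABELS and keep the weights fixed
perm_stat <- replicate(P, {
  g <- sample(d$feed)
  abs(mean(d$weight[g == "casein"]) - mean(d$weight[g == "meatmeal"]))
})

# Permutation distribution of the test statistic, with the observed value marked
h <- hist(perm_stat, breaks = 50,
          main = "Permutation distribution",
          xlab = "Absolute difference in mean weight (g)")
abline(v = obs_stat, col = "red", lwd = 2)
```

The share of the histogram lying at or to the right of the vertical line corresponds to the estimated p-value.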
