Chi-Square Test, Fisher's Exact Test, & Cross Tabulations in R | R Tutorial 4.10 | MarinStatsLectures
TLDR
In this educational video, Mike Marin shows how to perform the chi-square test of independence and Fisher's exact test in the R programming language. He uses the lung capacity data to examine the relationship between gender and smoking, building a contingency table, visualizing it with a bar plot, and then running the tests. Marin covers the 'chisq.test' and 'fisher.test' functions (R is case-sensitive, so both names are lowercase), including Yates' continuity correction for the chi-square test and a confidence interval for the odds ratio from Fisher's test. The video closes with a teaser for upcoming content on calculating relative risks and odds ratios.
Takeaways
- Introduction: Mike Marin presents a tutorial on performing the chi-square test and Fisher's exact test in the R programming language.
- Chi-Square Test: The chi-square test of independence is a statistical method for testing independence between two categorical variables.
- Data Exploration: The tutorial uses the lung capacity data to explore the relationship between gender and smoking.
- Importing Data: The data has already been imported and attached in R for analysis.
- Contingency Table: The 'table' function in R is used to create the contingency table for the analysis.
- Visual Representation: A bar plot is produced with the 'barplot' command, using the 'beside' and 'legend' arguments, to examine the relationship visually.
- chisq.test: The chi-square test is conducted with the 'chisq.test' function, with the option to apply Yates' continuity correction.
- Test Results: The test statistic and p-value are reported, and the results can be stored in an object for further analysis.
- Attributes: The 'attributes' function shows what is stored in the results object, so that individual pieces can be extracted.
- Fisher's Exact Test: When the chi-square test's assumptions are not met, Fisher's exact test is a nonparametric alternative.
- fisher.test: This test is performed with the 'fisher.test' function, with options to request a confidence interval and set the confidence level.
- Future Content: The next video will discuss packages for calculating relative risks and odds ratios.
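The table and bar plot steps listed above might look like the following minimal sketch. The lung capacity data file is not part of this page, so the data frame `dat` and its counts are made up purely for illustration; only the `Gender` and `Smoke` variable names follow the video.

```r
# Hypothetical stand-in for the lung capacity data.
# These counts are invented for illustration, not taken from the video.
dat <- data.frame(
  Gender = rep(c("female", "male"), times = c(60, 65)),
  Smoke  = c(rep(c("no", "yes"), times = c(50, 10)),   # the 60 females
             rep(c("no", "yes"), times = c(58, 7)))    # the 65 males
)

# Cross-tabulate the two categorical variables and save the table for later.
TAB <- table(dat$Gender, dat$Smoke)
TAB

# Clustered (side-by-side) bars with a default legend.
# 'legend.text' is the full argument name; 'legend' works by partial matching.
barplot(TAB, beside = TRUE, legend.text = TRUE)
```

With a data set that has been attached, as in the video, the columns can be referenced directly by name instead of through `dat$`.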
Q & A
What is the main topic of the video by Mike Marin?
-The video is about conducting the chi-square test of independence and Fisher's exact test using the R programming language.
What is the chi-square test of independence used for?
-The chi-square test of independence is a parametric method used for testing the independence between two categorical variables.
What data set is used in the video for the example?
-The lung capacity data set, which was introduced earlier in the series, is used for the example in the video.
What variables' relationship is explored in the video?
-The video explores the relationship between gender and smoking using the lung capacity data.
How does one produce a contingency table in R?
-A contingency table can be produced in R using the 'table' command or function.
What is the purpose of the 'chisq.test' command/function in R?
-The 'chisq.test' command/function in R performs the chi-square test on a contingency table.
What is Yates' continuity correction and when is it used in the chi-square test?
-Yates' continuity correction adjusts the chi-square test statistic when the expected frequencies in the contingency table are small. It is controlled by the 'correct' argument of the 'chisq.test' function, which R applies only to 2-by-2 tables.
What does the 'fisher.test' command in R do?
-The 'fisher.test' command in R performs Fisher's exact test, a nonparametric alternative to the chi-square test.
What is the purpose of the 'conf.int' and 'conf.level' arguments in the 'fisher.test' function?
-The 'conf.int' argument requests a confidence interval for the odds ratio, and 'conf.level' sets the desired level of confidence for that interval.
How can one visualize the relationship between variables before the chi-square test?
-A bar plot can be used to visualize the relationship between variables, which can be produced using the 'barplot' command in R.
What is the significance of the p-value in the chi-square test?
-The p-value in the chi-square test is the probability of observing data at least as extreme as the data in hand, assuming the null hypothesis of independence is true. A p-value above the chosen significance level means the null hypothesis of independence is not rejected.
What should one do if the assumptions for the chi-square test are not met?
-If the assumptions for the chi-square test are not met, such as having small expected frequencies, one may consider using Fisher's exact test as an alternative.
What are the additional statistical measures that will be discussed in the next video?
-The next video will discuss a package for calculating relative risks, odds ratios, and other statistical measures.
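The answers about Yates' correction, expected frequencies, and the p-value can be checked side by side in a short sketch. The 2-by-2 counts below are invented for illustration, the 0.05 significance level is an assumed choice, and "all expected counts at least 5" is a common rule of thumb rather than something stated in the video.

```r
# Hypothetical 2x2 table of counts (made up for illustration).
TAB <- matrix(c(50, 58, 10, 7), nrow = 2,
              dimnames = list(Gender = c("female", "male"),
                              Smoke  = c("no", "yes")))

# Run the test with and without Yates' continuity correction.
with_yates    <- chisq.test(TAB, correct = TRUE)   # R's default for 2x2 tables
without_yates <- chisq.test(TAB, correct = FALSE)

# The correction shrinks the test statistic, so the p-value is not smaller.
c(corrected   = unname(with_yates$statistic),
  uncorrected = unname(without_yates$statistic))

# Expected counts under independence, used to judge the test's assumptions.
with_yates$expected
assumption_ok <- all(with_yates$expected >= 5)   # rule of thumb (assumption)

# Decision at an assumed 5% significance level.
alpha  <- 0.05
reject <- with_yates$p.value < alpha
```

If `assumption_ok` were `FALSE`, that would be the situation in which the Q & A suggests turning to Fisher's exact test.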
Outlines
Introduction to Chi-Square and Fisher's Tests in R
In this video, Mike Marin introduces statistical tests of independence in the R programming language. He explains the chi-square test of independence and Fisher's exact test, focusing on their application to categorical variables. The lung capacity data is used to explore the relationship between gender and smoking. Marin demonstrates how to create a contingency table with the 'table' function and how to visualize the data with a bar plot, working from data that has already been imported. He also shows how to perform the chi-square test with the 'chisq.test' command, including Yates' continuity correction, and how to store the results for further analysis.
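Storing the test results and inspecting them could be sketched as follows. The object name `CHI` follows the video; the table counts are invented for illustration.

```r
# Hypothetical 2x2 table of counts (made up for illustration).
TAB <- matrix(c(50, 58, 10, 7), nrow = 2,
              dimnames = list(Gender = c("female", "male"),
                              Smoke  = c("no", "yes")))

# Store the chi-square test results in an object instead of only printing them.
CHI <- chisq.test(TAB, correct = TRUE)

# List what R stored in the object (component names and the object's class).
attributes(CHI)

# Extract individual components with the '$' sign.
CHI$statistic   # chi-square test statistic
CHI$parameter   # degrees of freedom
CHI$p.value     # p-value
CHI$expected    # expected counts under independence

# Help pages are available with ?chisq.test or help(chisq.test).
```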
Keywords
Chi-square test of independence
R programming language
Contingency table
Bar plot
Test statistic
P-value
Yates' continuity correction
Attributes
Fisher's exact test
Confidence interval
Odds ratio
Highlights
Introduction to the chi-square test of independence and Fisher's exact test using R programming language.
The chi-square test is a parametric method for testing independence between two categorical variables.
Using lung capacity data to explore the relationship between gender and smoking.
Importing and attaching data in R for analysis.
Using the 'chisq.test' function to perform the chi-square test in R.
Accessing help in R for specific commands or functions.
Creating a contingency table using the 'table' command/function in R.
Saving the contingency table in an object called 'TAB' for later use.
Visual examination of the relationship using a bar plot with the 'barplot' command.
Setting the 'beside' argument to TRUE for clustered bar charts.
Producing a default legend with the 'legend' argument set to TRUE.
Conducting the chi-square test with the 'correct' argument for Yates' continuity correction.
Storing the test results in an object named 'CHI'.
Using the 'attributes' function to explore what R stored in the 'CHI' object.
Extracting certain attributes from the 'CHI' object using the '$' sign.
Considering Fisher's exact test when chi-square test assumptions are not met.
Using the 'fisher.test' command as a nonparametric alternative to the chi-square test.
Setting 'conf.int' to TRUE for a confidence interval of the odds ratio.
Adjusting the 'conf.level' argument for the desired level of confidence.
Upcoming discussion on a package for calculating relative risks and odds ratios in the next video.
Encouragement to subscribe for more R programming and statistics videos.
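The Fisher's exact test steps in the highlights above might look like this minimal sketch. The counts are invented for illustration, the object name `FISH` is hypothetical, and the 99% level is only an example of changing 'conf.level' from its default of 0.95.

```r
# Hypothetical 2x2 table of counts (made up for illustration).
TAB <- matrix(c(50, 58, 10, 7), nrow = 2,
              dimnames = list(Gender = c("female", "male"),
                              Smoke  = c("no", "yes")))

# Fisher's exact test with a 99% confidence interval for the odds ratio.
FISH <- fisher.test(TAB, conf.int = TRUE, conf.level = 0.99)

FISH$p.value    # exact p-value for the test of independence
FISH$estimate   # estimated odds ratio (conditional maximum likelihood)
FISH$conf.int   # confidence interval for the odds ratio
```

For a 2-by-2 table the odds ratio reported by `fisher.test` is a conditional maximum likelihood estimate, so it will generally differ slightly from the simple cross-product ratio of the four cells.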
Browse More Related Video
Odds Ratio, Relative Risk & Risk Difference with R | R Tutorial 4.11 | MarinStatsLectures
One-Sample t Test & Confidence Interval in R with Example | R Tutorial 4.1 | MarinStatsLectures
Paired t-Test in R with Examples | R Tutorial 4.7 | MarinStatsLectures
Stacked and Grouped Bar Charts and Mosaic Plots in R | R Tutorial 2.6 | MarinStatsLectures
Correlations and Covariance in R with Example | R Tutorial 4.12 | MarinStatsLectures
Two-Sample t Test in R (Independent Groups) with Example | R Tutorial 4.2 | MarinStatsLectures