Paired t-Test in R with Examples | R Tutorial 4.7 | MarinStatsLectures

MarinStatsLectures-R Programming & Statistics
25 Aug 2013 | 04:20

TLDR: In this video, Mike Marin teaches how to perform a paired t-test and calculate confidence intervals using R programming. The tutorial uses systolic blood pressure data to demonstrate the analysis of differences between paired observations before and after treatment. The video guides viewers through data visualization, hypothesis testing with the 't.test' function, and interpretation of the results, including the test statistic, p-value, and confidence interval. It also highlights the importance of data order in the analysis and concludes with a teaser for the next video on the Wilcoxon Signed Rank test.

Takeaways
  • The video is an educational tutorial by Mike Marin on conducting a paired t-test and confidence interval using R Programming Language.
  • It focuses on the statistical analysis of paired or dependent populations, exemplified by measurements of systolic blood pressure before and after treatment.
  • The data set consists of 25 paired observations detailing subject numbers, before measurements, and after measurements.
  • The 't.test' function in R is introduced for performing the paired t-test, with guidance on accessing help for R commands.
  • A boxplot is suggested for initial data examination, indicating a potential average decrease in blood pressure post-treatment.
  • The tutorial demonstrates creating a scatterplot to visualize paired measurements and the changes in individuals.
  • A 45-degree line is added to the scatterplot to represent the 'no change' scenario, helping to visually assess the treatment effect.
  • The paired t-test is conducted with a null hypothesis that the mean difference in blood pressure is zero, using a two-sided alternative hypothesis.
  • The 'mu' argument is set to 0, 'alt' to 'two.sided', 'paired' to TRUE, and 'conf.level' to 0.99 (a 99% confidence level) in the 't.test' function for the analysis.
  • The results include a test statistic of 3.88, a p-value of 0.000698, a 99% confidence interval from 2.245 to 13.754, and a sample mean difference of 8.
  • The importance of the order of input for before and after measurements is highlighted, noting that it affects the sign of the results.
  • The next video in the series will cover the Wilcoxon Signed Rank test, a nonparametric alternative to the paired t-test.
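The 't.test' call described in the takeaways can be sketched roughly as follows. The vectors 'before' and 'after' are simulated stand-ins (an assumption, not the video's blood pressure data), so the printed numbers will not match the results quoted above.

```r
# Simulated stand-in data: NOT the video's data set (hypothetical values).
set.seed(42)
n <- 25
before <- rnorm(n, mean = 140, sd = 12)          # "before" systolic readings
after  <- before - rnorm(n, mean = 8, sd = 10)   # "after" readings, lower on average

# Paired t-test of H0: mean difference = 0, two-sided, with a 99% interval.
# 'alternative' can be abbreviated to 'alt', as in the video.
result <- t.test(before, after,
                 mu = 0,
                 alternative = "two.sided",
                 paired = TRUE,
                 conf.level = 0.99)
print(result)

# Individual pieces of the result
result$statistic   # test statistic (t)
result$p.value     # p-value
result$conf.int    # 99% confidence interval for the mean difference
result$estimate    # sample mean of the differences (before - after)
```

Note that 'conf.level' takes a proportion (0.99), not a percentage.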
Q & A
  • What is the main topic of the video presented by Mike Marin?

    -The main topic of the video is how to conduct a paired t-test and calculate the confidence interval using R Programming Language.

  • What are the parametric methods discussed in the video for examining the difference in means of two populations?

    -The parametric methods discussed are the paired t-test and confidence interval, which are appropriate for examining the difference in means for two populations that are paired or dependent on each other.

  • What type of data does the video use as an example for demonstrating the paired t-test?

    -The video uses data involving measurements on systolic blood pressure before and after receiving some treatment as an example.

  • How many paired observations are included in the data set used in the video?

    -The data set consists of 25 paired observations.

  • What R command/function is used to conduct the paired t-test in the video?

    -The 't.test' command/function in R is used to conduct the paired t-test.

  • How can one access help for a specific command in R as mentioned in the video?

    -To access help for a specific command in R, you can call 'help' with the command name in parentheses, for example help(t.test), or simply place a question mark in front of the command name, as in ?t.test.

  • What is the purpose of examining a boxplot of the data before performing the paired t-test?

    -Examining a boxplot of the data allows one to visually compare the before and after measurements and see some evidence of the average change in blood pressure after treatment.

  • What is the significance of the 45-degree line added to the scatterplot in the video?

    -The 45-degree line, which represents the line where X equals Y or before equals after, helps visualize if there is a change in blood pressure. Points falling below the line indicate a decrease in blood pressure after treatment.

  • What is the null hypothesis used in the paired t-test conducted in the video?

    -The null hypothesis used is that the mean difference in systolic blood pressure is zero.

  • What is the significance of the p-value obtained from the paired t-test in the video?

    -The p-value of 0.000698 means that, if the null hypothesis were true, a result at least as extreme as the one observed would be very unlikely. This suggests a statistically significant difference in systolic blood pressure before and after treatment.

  • What does the 99 percent confidence interval for the mean difference indicate in the context of the video?

    -The 99 percent confidence interval, ranging from 2.245 to 13.754, gives a range of plausible values for the true mean difference (before minus after) in systolic blood pressure, with 99% confidence. Because the interval excludes zero, it agrees with the test's rejection of the null hypothesis.

  • How does entering the 'after' measurements before the 'before' measurements affect the results of the paired t-test?

    -Entering the 'after' measurements before the 'before' measurements makes R compute the difference as after minus before. This flips the sign of the mean difference, the test statistic, and the confidence interval endpoints, but leaves their magnitudes and the two-sided p-value unchanged.

  • What is the next topic that will be discussed in the video series?

    -The next video in the series will discuss the Wilcoxon Signed Rank test, which is a nonparametric equivalent to the paired t-test.
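Two of the answers above, on getting help and on argument order, can be illustrated with a short sketch. The data are simulated placeholders (an assumption, not the video's measurements); only the sign-flipping behaviour is the point.

```r
# Help pages, as described in the video (uncomment to open them):
# help(t.test)
# ?t.test

# Simulated stand-in data: NOT the video's data set (hypothetical values).
set.seed(42)
before <- rnorm(25, mean = 140, sd = 12)
after  <- before - rnorm(25, mean = 8, sd = 10)

# Differences computed as before - after
res_ba <- t.test(before, after, paired = TRUE, conf.level = 0.99)
# Differences computed as after - before
res_ab <- t.test(after, before, paired = TRUE, conf.level = 0.99)

# The estimate, test statistic, and interval endpoints flip sign ...
res_ba$estimate;  res_ab$estimate
res_ba$statistic; res_ab$statistic
res_ba$conf.int;  res_ab$conf.int

# ... while the two-sided p-value stays the same.
all.equal(res_ba$p.value, res_ab$p.value)
```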

Outlines
00:00
Introduction to Paired t-test and Confidence Interval in R

In this video, Mike Marin introduces the concept of the paired t-test and confidence interval, which are statistical methods used to examine the differences in means between two related groups. The context provided is a study involving systolic blood pressure measurements before and after a treatment. The data consists of 25 paired observations. The video will demonstrate how to use R programming language to conduct these analyses, starting with importing and examining the data, and then proceeding to visualize the differences using boxplots and scatterplots. The significance of the 45-degree line in scatterplots is explained, which helps to visually assess changes in blood pressure.

Keywords
Paired t-test
A paired t-test is a statistical method used to compare the means of two related groups to determine if there is a significant difference between them. In the video, it is used to examine the change in systolic blood pressure before and after treatment, which are paired observations. The script mentions using the 't.test' function in R to perform this test, setting the 'paired' argument to TRUE to indicate the paired nature of the data.
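One way to see what 'paired = TRUE' does is to note that a paired t-test is equivalent to a one-sample t-test on the within-pair differences. The sketch below checks this on simulated placeholder data (an assumption, not the video's data set).

```r
# Simulated stand-in data: NOT the video's data set (hypothetical values).
set.seed(42)
before <- rnorm(25, mean = 140, sd = 12)
after  <- before - rnorm(25, mean = 8, sd = 10)

d <- before - after   # within-pair differences

# Paired test on the two vectors
paired_res <- t.test(before, after, paired = TRUE)

# One-sample test on the differences against mu = 0
onesample_res <- t.test(d, mu = 0)

# The same statistic computed by hand: mean(d) / (sd(d) / sqrt(n))
t_by_hand <- mean(d) / (sd(d) / sqrt(length(d)))

c(paired     = unname(paired_res$statistic),
  one_sample = unname(onesample_res$statistic),
  by_hand    = t_by_hand)
```

All three values should agree, which is why the pairing (and the row order of the two vectors) matters.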
Confidence Interval
A confidence interval provides a range of values that are likely to contain a population parameter with a certain level of confidence. In the context of the video, a 99% confidence interval is calculated for the mean difference in systolic blood pressure, indicating the range within which the true mean difference is likely to fall. The script describes the interval as running from 2.245 to 13.754.
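As a rough check, the interval can be rebuilt from the summary numbers quoted in this summary (n = 25, mean difference 8, test statistic 3.88) using the standard formula: mean difference plus or minus a t critical value times the standard error. Because 3.88 is a rounded value, the result should land close to, but not exactly on, the reported 2.245 to 13.754.

```r
# Summary values as reported in this summary of the video
n         <- 25
mean_diff <- 8
t_stat    <- 3.88          # rounded, so results below are approximate

# Standard error implied by t = mean_diff / se
se <- mean_diff / t_stat

# Critical value for a 99% two-sided interval with n - 1 degrees of freedom
t_crit <- qt(1 - 0.01 / 2, df = n - 1)

# Confidence interval: mean_diff +/- t_crit * se
ci <- mean_diff + c(-1, 1) * t_crit * se
ci

# Two-sided p-value implied by the rounded test statistic
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
p_value
```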
R Programming Language
R is a programming language and environment commonly used for statistical computing and graphics. The video script discusses how to conduct a paired t-test and calculate a confidence interval specifically using R. The script provides examples of R commands such as 't.test' and 'abline' to perform statistical analysis and visualization.
Systolic Blood Pressure
Systolic blood pressure is the maximum pressure in the arteries when the heart contracts and is a key measure of cardiovascular health. The video's data involves measurements of systolic blood pressure before and after a treatment, which is the central focus of the statistical analysis being discussed.
Paired Observations
Paired observations refer to data points that are related or matched, such as measurements taken on the same subjects before and after an event. In the video, the script mentions that the data consists of 25 paired observations of systolic blood pressure, which are essential for conducting a paired t-test.
Boxplot
A boxplot is a graphical representation of the distribution of a dataset, displaying the median, quartiles, and potential outliers. The script suggests examining a boxplot of the data to compare before and after measurements of blood pressure, providing a visual summary that can indicate trends or changes.
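A side-by-side boxplot of this kind could be produced along the following lines. The data, axis label, and title are illustrative assumptions rather than the video's exact code.

```r
# Simulated stand-in data: NOT the video's data set (hypothetical values).
set.seed(42)
before <- rnorm(25, mean = 140, sd = 12)
after  <- before - rnorm(25, mean = 8, sd = 10)

# Side-by-side boxplots of the two sets of measurements.
# boxplot() also returns the plotted summary statistics invisibly.
bp <- boxplot(before, after,
              names = c("Before", "After"),
              ylab  = "Systolic blood pressure",
              main  = "Before vs. after treatment (simulated)")

# Medians of each group, for a numeric comparison alongside the plot
c(before = median(before), after = median(after))
```

A boxplot ignores the pairing, so it only hints at an average shift; the scatterplot and the paired test use the pairing directly.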
Scatterplot
A scatterplot is a type of plot that displays the values of two variables for a set of data. In the video, a scatterplot is used to visualize the before and after measurements of systolic blood pressure, allowing viewers to see the relationship and differences between the paired observations.
45-Degree Line
The 45-degree line, also known as the line of equality, is a diagonal line on a scatterplot where the x and y values are equal. In the script, the 'abline' function in R is used to add this line to the scatterplot, helping to visualize whether there is a change in blood pressure, as points below the line indicate a decrease after treatment.
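A minimal sketch of the scatterplot with the line of equality is shown here, using simulated placeholder data (an assumption, not the video's data set). In 'abline', 'a' is the intercept and 'b' is the slope, so a = 0 and b = 1 draws the line where after equals before.

```r
# Simulated stand-in data: NOT the video's data set (hypothetical values).
set.seed(42)
before <- rnorm(25, mean = 140, sd = 12)
after  <- before - rnorm(25, mean = 8, sd = 10)

# Scatterplot of the paired measurements: one point per subject
plot(before, after,
     xlab = "Before treatment",
     ylab = "After treatment",
     main = "Paired systolic blood pressure (simulated)")

# 45-degree line: intercept 0, slope 1, i.e. after == before ("no change")
abline(a = 0, b = 1)

# Points below the line are subjects whose reading decreased
n_below <- sum(after < before)
n_below
```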
Null Hypothesis
The null hypothesis is a statement of no effect or no difference, which is tested in a statistical hypothesis test. In the video, the null hypothesis is that the mean difference in systolic blood pressure is zero, which the paired t-test is used to evaluate.
Alternative Hypothesis
The alternative hypothesis is a statement that contradicts the null hypothesis and represents what the researcher believes to be true. In the video, the alternative hypothesis is set to 'two.sided', indicating that the test is looking for any difference, not just an increase or decrease.
Wilcoxon Signed Rank Test
The Wilcoxon Signed Rank test is a nonparametric statistical test used to compare two related samples. It is mentioned in the script as the topic of the next video, serving as an alternative to the paired t-test when the data does not meet the assumptions required for parametric testing.
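For orientation only, base R's 'wilcox.test' accepts the same style of arguments as 't.test'. The sketch below is not taken from the video; it uses simulated placeholder data to show what the nonparametric counterpart of the call might look like.

```r
# Simulated stand-in data: NOT the video's data set (hypothetical values).
set.seed(42)
before <- rnorm(25, mean = 140, sd = 12)
after  <- before - rnorm(25, mean = 8, sd = 10)

# Wilcoxon signed rank test on the paired measurements
w <- wilcox.test(before, after,
                 mu = 0,
                 alternative = "two.sided",
                 paired = TRUE)
print(w)

w$statistic   # V, the signed rank statistic
w$p.value     # p-value
```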
Highlights

Mike Marin presents a tutorial on conducting a paired t-test and confidence interval using R Programming Language.

The paired t-test and confidence interval are parametric methods for examining the difference in means of paired populations.

The tutorial uses data on systolic blood pressure measurements before and after treatment.

There are 25 paired observations with subject numbers, before, and after measurements.

Data has been imported into R and is ready for analysis.

A boxplot is used to examine the change in systolic blood pressure from before to after treatment.

Evidence suggests that blood pressure is lower on average after treatment.

A scatterplot is produced to visualize paired before and after measurements.

A 45-degree line is added to the scatterplot to represent no change in blood pressure.

Points falling below the line indicate a decrease in blood pressure after treatment.

The 't.test' command in R is used to conduct the paired t-test.

Null hypothesis states that the mean difference in systolic blood pressure is zero.

A two-sided alternative hypothesis is used for the test.

The 'paired' argument is set to TRUE to indicate paired data.

A 99% confidence level is used for the confidence interval.

Results show a test statistic of 3.88 and a p-value of 0.000698, indicating significance.

The 99% confidence interval for the mean difference is from 2.245 to 13.754.

The sample mean difference is 8, suggesting a decrease in blood pressure after treatment.

The order of entering before and after measurements changes only the sign of the results, not their magnitude or the conclusion.

The next video will discuss the Wilcoxon Signed Rank test, a nonparametric alternative to the paired t-test.

The tutorial concludes with an invitation to subscribe for more statistics and R programming language content.
