Hypothesis Testing: One Sided vs Two Sided Alternative | Statistics Tutorial #14 | MarinStatsLectures

MarinStatsLectures-R Programming & Statistics
15 Aug 2018 | 08:16
Educational, Learning

TLDR
This transcript discusses the differences between one-sided and two-sided hypothesis testing in the context of comparing mean BMI values between 2008 and 2018. It explains the concept of null and alternative hypotheses, and how to interpret the p-values resulting from each test. The speaker emphasizes the importance of not solely relying on p-values, but also considering effect size, confidence intervals, sample size, and study design when making conclusions in research.

Takeaways
  • The discussion covers the differences between one-sided and two-sided hypothesis tests, using a comparison of mean BMI values between 2008 and 2018.
  • The null hypothesis for the example is that the mean BMI in 2018 hasn't changed from the known 2008 value of 25.3.
  • A sample of 25 individuals yielded a mean BMI of 27.8 with a standard deviation of 6, which was used to calculate the test statistic.
  • In the one-sided test, the alternative hypothesis is that the mean BMI in 2018 is greater than the 2008 mean; the sample mean lies 2.08 standard errors above the hypothesized value.
  • The one-sided p-value is 1.9% using the Z-distribution and 2.4% using the t-distribution.
  • For the two-sided test, the alternative hypothesis is that the mean BMI in 2018 is not equal to the 2008 mean, considering deviations in both directions.
  • The two-sided p-value is essentially double the one-sided p-value, which here gives 4.8% using the t-distribution.
  • The p-value should not be the sole determinant in making research conclusions; effect size, confidence intervals, sample size, power, and study design should be weighed as well.
  • The script advises against relying too heavily on the 5% alpha level as a cutoff for significance, as the context and other factors should also be taken into account.
  • The practical implications and the broader research question matter, not just the p-value.
  • The script encourages readers to look beyond the p-value and consider the entire statistical analysis, including the study's limitations and generalizability.
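
The numbers in the takeaways above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not from the video, and the Z-based p-values treat the sample standard deviation as if it were the known population value.

```python
import math
from statistics import NormalDist

# Values quoted in the summary
mu0 = 25.3    # hypothesized (2008) mean BMI
xbar = 27.8   # 2018 sample mean
s = 6.0       # 2018 sample standard deviation
n = 25        # sample size

se = s / math.sqrt(n)      # standard error of the mean
z = (xbar - mu0) / se      # test statistic, measured in standard errors

# One-sided: probability of a statistic at least this large under the null
p_one_sided = 1 - NormalDist().cdf(z)
# Two-sided: the same distance from mu0 in either direction
p_two_sided = 2 * p_one_sided

print(f"SE = {se:.2f}, test statistic = {z:.2f}")
print(f"one-sided p = {p_one_sided:.3f}, two-sided p = {p_two_sided:.3f}")
```

Doubling the rounded 1.9% gives the 3.8% quoted in this summary; carrying the unrounded test statistic through gives a two-sided value that is slightly smaller, roughly 3.7%.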
Q & A
  • What is the main topic of the transcript?

    -The main topic of the transcript is the difference between one-sided and two-sided alternative hypothesis tests.

  • What was the known mean BMI in the US in 2008?

    -The known mean BMI in the US in 2008 was 25.3.

  • What was the sample mean BMI found in the 2018 sample of 25 individuals?

    -The sample mean BMI found in the 2018 sample was 27.8.

  • What was the sample standard deviation in the 2018 sample?

    -The sample standard deviation in the 2018 sample was 6.

  • What is the null hypothesis for the one-sided test?

    -The null hypothesis for the one-sided test is that the mean in 2018 hasn't changed from the 2008 mean of 25.3.

  • What is the alternative hypothesis for the one-sided test?

    -The alternative hypothesis for the one-sided test is that the mean in 2018 is now greater than the mean of 2008.

  • What is the test statistic for the example given?

    -The test statistic for the example given is 2.08, meaning the sample mean is 2.08 standard errors above the hypothesized value.

  • What is the p-value for the one-sided test using the Z distribution?

    -The p-value for the one-sided test using the Z distribution is 1.9%.

  • What is the p-value for the one-sided test using the t-distribution?

    -The p-value for the one-sided test using the t-distribution is 2.4%.

  • How does the two-sided test differ from the one-sided test?

    -The two-sided test considers the possibility of the mean being either greater than or less than the hypothesized value, whereas the one-sided test only considers one direction of deviation.

  • What is the p-value for the two-sided test using the Z distribution?

    -The p-value for the two-sided test using the Z distribution is approximately 3.8%, double the one-sided p-value of 1.9%.

  • What is the significance of the p-value in hypothesis testing?

    -The p-value is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. A smaller p-value indicates stronger evidence against the null hypothesis, but it is not a definitive yes/no answer and should be considered alongside other factors such as effect size, sample size, and study design.
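
The t-distribution p-values quoted in the answers above (2.4% one-sided, 4.8% two-sided) can be checked without a statistics package. The sketch below integrates the Student t density numerically with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are illustrative, and 24 degrees of freedom (n - 1) is assumed, as is usual for a one-sample t-test.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on the interval [0, t]."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    if steps % 2:          # Simpson's rule needs an even number of intervals
        steps += 1
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric about 0, so half the mass lies above 0
    return 0.5 - area_0_to_t

t_stat = (27.8 - 25.3) / (6.0 / math.sqrt(25))
p_one = t_upper_tail(t_stat, df=24)
p_two = 2 * p_one
print(f"t = {t_stat:.2f}, one-sided p = {p_one:.3f}, two-sided p = {p_two:.3f}")
```

The t-based values are a little larger than the Z-based ones because the t-distribution has heavier tails when the standard deviation is estimated from a small sample.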

Outlines
00:00
Understanding One-Sided vs Two-Sided Hypothesis Testing

This paragraph discusses the distinction between one-sided and two-sided hypothesis tests. It uses the example of testing whether the mean BMI in the US increased from 2008 to 2018. The null hypothesis posits no change, while the one-sided alternative hypothesis suggests an increase. The paragraph explains the concept of standard errors and how they are used to calculate the probability of obtaining a sample mean as extreme as the one observed, under the null hypothesis. It contrasts this with the two-sided test, which considers both directions of deviation from the null hypothesis value. The explanation includes the calculation of p-values for both tests and emphasizes that a two-sided p-value is essentially double the one-sided p-value. The paragraph concludes by highlighting that while p-values are useful, they should not be the sole basis for determining the significance of a research finding.

05:02
🧐 The Importance of Context Beyond P-Values in Statistical Analysis

This paragraph emphasizes the importance of considering context and multiple factors beyond just p-values when interpreting statistical results. It points out that a small p-value does not necessarily mean the null hypothesis is false, and a large p-value does not mean it is true. The paragraph discusses the limitations of using a single alpha level as a cutoff for significance and argues for a more nuanced approach that takes into account effect size, confidence intervals, sample size, study design, and data collection methods. It suggests that decisions about the harmfulness of exposures or the effectiveness of treatments should not rely solely on p-values but should be informed by a comprehensive evaluation of the study's methodology and findings.

Keywords
Hypothesis Test
A hypothesis test is a statistical method for assessing whether sample data provide enough evidence against a hypothesis about a population parameter. In the context of the video, the hypothesis test is used to evaluate whether the mean BMI in the US has changed from 2008 to 2018. The video discusses both one-sided and two-sided hypothesis tests, which test for an increase or for any difference, respectively, in the mean value.
Null Hypothesis
The null hypothesis is the default assumption in statistical hypothesis testing that there is no difference or effect. It serves as the starting point for the test: either the data provide enough evidence to reject it (indicating a change or effect), or they do not and we fail to reject it. Failing to reject it is not proof that there is no change or effect. In the video, the null hypothesis is that the mean BMI in 2018 hasn't changed from the 2008 value of 25.3.
Alternative Hypothesis
The alternative hypothesis is the competing claim to the null hypothesis, stating that there is a difference or effect. It is what the researcher is looking for evidence of in the data. In the video, the one-sided alternative hypothesis suggests that the mean BMI in 2018 is greater than in 2008, while the two-sided alternative hypothesis suggests that the mean BMI is not equal to the 2008 value, allowing for either an increase or decrease.
Standard Error
Standard error is the standard deviation of a sample statistic, such as a sample mean, across repeated samples. It quantifies the precision of the estimate. For a sample mean it is the sample standard deviation divided by the square root of the sample size, here 6 / sqrt(25) = 1.2. In the video, the sample mean BMI of 27.8 is 2.08 standard errors above the hypothesized value of 25.3, which counts as evidence against the null hypothesis.
One-Sided Test
A one-sided test, or one-tailed test, is a statistical hypothesis test that checks for an effect in one direction only. It is used when the researcher is interested in a specific type of change, such as an increase or a decrease. In the video, a one-sided test is used to check if the mean BMI in 2018 is greater than the 2008 mean, with a p-value of 1.9% or 2.4% using the Z or t-distribution, respectively.
Two-Sided Test
A two-sided test, or two-tailed test, is a statistical hypothesis test that checks for an effect in both directions. It is used when the researcher is interested in any difference from the null hypothesis, regardless of the direction. In the video, a two-sided test is used to check if the mean BMI in 2018 is different (either higher or lower) than the 2008 mean, with a p-value double that of the one-sided test, resulting in 3.8% or 4.8% using the Z or t-distribution, respectively.
P-Value
The p-value, or probability value, is the probability of obtaining a result as extreme or more extreme than the observed result, assuming the null hypothesis is true. It is used to help make decisions about whether to reject the null hypothesis. A lower p-value indicates stronger evidence against the null hypothesis. In the video, the p-values for both one-sided and two-sided tests are discussed, with the two-sided p-value being double the one-sided p-value.
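
One way to make this definition concrete is to simulate it. The sketch below assumes, purely for illustration, that BMI under the null hypothesis is normally distributed with mean 25.3 and standard deviation 6 (the video does not state a population distribution). It draws many samples of 25 and counts how often the sample mean is at least 27.8. The function name and its parameters are hypothetical.

```python
import random
from statistics import fmean

def simulated_one_sided_p(mu0, sigma, n, observed_mean, reps=20_000, seed=1):
    """Share of simulated null samples whose mean is >= the observed mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # One sample of size n drawn from the assumed null population
        sample_mean = fmean(rng.gauss(mu0, sigma) for _ in range(n))
        if sample_mean >= observed_mean:
            hits += 1
    return hits / reps

p_sim = simulated_one_sided_p(mu0=25.3, sigma=6.0, n=25, observed_mean=27.8)
print(f"simulated one-sided p-value: {p_sim:.3f}")
```

Because the simulation treats the standard deviation as known, the result should land near the Z-based value of about 1.9% rather than the t-based 2.4%, up to simulation noise.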
Effect Size
Effect size is a measure that describes the magnitude of a phenomenon or the strength of the relationship between variables. It can be reported in the original units or standardized so that results can be compared across different studies. In the video, the effect size is captured by the difference in mean BMI between 2008 and 2018 (27.8 - 25.3 = 2.5), and it is suggested that researchers should consider the effect size, not just the p-value, when interpreting results.
Confidence Interval
A confidence interval is a range of values, derived from a statistical procedure, that captures the unknown parameter in a stated percentage of repeated samples (the confidence level, for example 95%). Its width indicates how precise the estimate is. In the video, building a confidence interval around the effect size would give a range of plausible values for the mean BMI difference between 2008 and 2018.
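
As a rough sketch of such an interval for the BMI example: the critical value 2.064 below is the usual table value for a two-sided 95% interval with 24 degrees of freedom, and the 95% level itself is an assumption, since this summary does not say which level the video uses.

```python
import math

xbar, mu0 = 27.8, 25.3   # sample mean and hypothesized mean from the summary
s, n = 6.0, 25           # sample standard deviation and sample size
t_crit = 2.064           # assumed: 97.5th percentile of t with n - 1 = 24 df (table value)

se = s / math.sqrt(n)    # standard error of the mean
diff = xbar - mu0        # estimated change in mean BMI
margin = t_crit * se     # half-width of the interval
ci_low, ci_high = diff - margin, diff + margin

print(f"difference = {diff:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The lower end of the interval sits just above zero, which is consistent with the two-sided t-based p-value falling just under 5%.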
Sample Size
Sample size refers to the number of observations or individuals in a sample. It is a critical factor in statistical analysis because it affects the precision of the results and the power of the test. A larger sample size generally leads to more precise estimates and a higher chance of detecting a true effect. In the video, the sample size of 25 individuals is mentioned, and it is implied that the sample size can impact the margin of error and the power of the study.
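
To see how much the sample size matters, the sketch below recomputes the two-sided Z-based p-value while holding the sample mean (27.8) and standard deviation (6) fixed and varying n. The sample sizes other than 25 are hypothetical, and `two_sided_p` is an illustrative helper, not something from the video.

```python
import math
from statistics import NormalDist

def two_sided_p(xbar, mu0, s, n):
    """Two-sided p-value from the normal approximation."""
    se = s / math.sqrt(n)            # standard error shrinks as n grows
    z = (xbar - mu0) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same observed difference, different (hypothetical) sample sizes
results = {n: two_sided_p(27.8, 25.3, 6.0, n) for n in (10, 25, 100)}
for n, p in results.items():
    print(f"n = {n:3d}: two-sided p = {p:.4f}")
```

The same 2.5-point difference is far from the 5% cutoff with 10 people, just under it with 25, and far below it with 100, which is one reason a p-value cannot be read without the sample size.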
Statistical Power
Statistical power is the probability that a statistical test will detect an effect when there is one. It is influenced by the sample size, the effect size, and the significance level of the test. A higher power means a higher likelihood of correctly rejecting the null hypothesis when it is false. In the video, the concept of power is mentioned as an important factor to consider when interpreting the results of a hypothesis test.
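
Power can only be computed against a specific alternative, so the sketch below assumes a true 2018 mean of 27.8 (hypothetical; the video's sample mean is borrowed as a stand-in), a known standard deviation of 6, and a one-sided Z-test at the 5% level. `power_one_sided_z` is an illustrative helper.

```python
import math
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Chance that an upper-tailed Z-test rejects H0 when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)
    z_alpha = std_normal.inv_cdf(1 - alpha)   # rejection cutoff in standard errors
    shift = (mu_true - mu0) / se              # true effect in standard errors
    return 1 - std_normal.cdf(z_alpha - shift)

power_25 = power_one_sided_z(mu0=25.3, mu_true=27.8, sigma=6.0, n=25)
power_100 = power_one_sided_z(mu0=25.3, mu_true=27.8, sigma=6.0, n=100)
print(f"power with n = 25:  {power_25:.2f}")
print(f"power with n = 100: {power_100:.2f}")
```

Under these assumptions a sample of 25 has roughly a two-in-three chance of detecting the increase, and a larger sample raises that chance.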
Research Question
A research question is a specific, focused inquiry that guides the design and conduct of a study. It is the central issue that the research aims to address. In the video, the research question is whether an exposure to a risk factor is harmful or beneficial, and it is emphasized that the decision should not be based solely on a p-value but also on other factors such as effect size, confidence intervals, sample size, and study design.
Highlights

Exploring the differences between one-sided and two-sided hypothesis tests.

Using the example of BMI changes from 2008 to 2018 to illustrate the hypothesis testing process.

The null hypothesis states that the mean BMI in 2018 hasn't changed from the 2008 mean of 25.3.

The sample mean BMI in 2018 is found to be 27.8 with a sample standard deviation of 6.

The test statistic for the one-sided test is 2.08 standard errors above the hypothesized value.

In a one-sided test, the alternative hypothesis suggests the mean BMI in 2018 is greater than in 2008.

For the two-sided test, the alternative hypothesis suggests the mean BMI in 2018 is not equal to 25.3.

The one-sided test shows a 1.9% probability of observing a sample mean as extreme as 27.8 if the null hypothesis is true.

Using the t-distribution, the one-sided p-value is 2.4%, and for the two-sided test, it is 4.8%.

The two-sided test is considered more conservative and fair as it accounts for deviations in both directions.

The p-value in a two-sided test is essentially double the one-sided p-value.

The concept of p-value should not be the sole determinant in making research conclusions.

Effect size, confidence intervals, sample size, and study design are all crucial factors in interpreting research findings.

The American Statistical Association's statement on p-values provides a deeper discussion on their appropriate use.

Decisions on harm or benefit from exposures should not rely on a single p-value but on a comprehensive analysis.

The importance of considering the context and multiple factors when making decisions based on research findings.

The p-value is a useful tool but not definitive in determining the truth of a research question.

The discussion emphasizes the need to move beyond a single-number conclusion towards a more nuanced understanding of research outcomes.
