Is Most Published Research Wrong?
TLDR
This script delves into the reliability of scientific research, questioning the validity of findings due to the misuse of p-values and the publish-or-perish culture in academia. It highlights the 'replication crisis', showing that many prominent findings, in fields from psychology to particle physics, have failed to replicate. The script also discusses the concept of 'p-hacking' and the incentives that lead researchers to prioritize publication over accuracy, but concludes with a note of optimism about recent efforts to improve scientific integrity.
Takeaways
- The script discusses a study suggesting humans can see into the future, but questions the validity of such claims based on statistical significance.
- It highlights the use of p-values in judging the significance of study results; the study's p-value of .01 means results at least that extreme would occur only about 1% of the time if chance alone were at work.
- The script challenges the common threshold of p < .05 for statistical significance, pointing out that it was arbitrarily chosen and may not be a reliable standard.
- It raises concerns about the prevalence of false positives in scientific research due to factors like publication bias and the high rate of false hypotheses being tested.
- The script mentions the 'Reproducibility Project', which found low rates of replication success in psychology studies, questioning the reliability of published research.
- It uses a humorous example of a study claiming chocolate aids weight loss to illustrate the concept of 'p-hacking' and how small sample sizes can lead to misleading results.
- The script points out that even in fields with stringent statistical requirements, like particle physics, false discoveries can occur due to biases in data interpretation.
- It discusses the incentives in scientific research that favor novel and statistically significant findings, potentially leading to an overemphasis on positive results.
- The script acknowledges the challenges in replicating studies and the reluctance of journals to publish replication studies, which can hinder scientific self-correction.
- It emphasizes the importance of peer review and methodological rigor in scientific research, despite the inherent flaws and the potential for incorrect conclusions.
- Lastly, the script concludes by reflecting on the human tendency to delude ourselves and the value of the scientific method as a more reliable way of knowing compared to other methods.
Q & A
What was the title of the article published in the 'Journal of Personality and Social Psychology' in 2011?
-The title of the article was 'Feeling the Future: Experimental Evidence for Anomalous Retroactive Influences on Cognition and Affect'.
What was the main claim of the 2011 study regarding the ability of people to see into the future?
-The main claim was that there was experimental evidence suggesting that people could have anomalous retroactive influences on cognition and affect, essentially implying the ability to see into the future.
What was the hit rate for participants when selecting the curtain with an image behind it, and what was considered significant?
-The hit rate was 53% when participants selected the curtain hiding an erotic image, which was considered statistically significant because it exceeded the 50% expected by chance by more than luck alone would plausibly explain (p = .01).
What is a p-value and how is it used to assess the significance of study results?
-A p-value is a statistical measure that indicates the probability of obtaining results at least as extreme as the observed results, assuming that the null hypothesis is true. It is used to assess the significance of study results, with values less than 0.05 generally considered significant.
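As a concrete sketch of the arithmetic, a p-value for a hit rate like the 53% figure can be computed with an exact binomial test. The trial count below is a made-up illustration, not the study's actual sample size:

```python
from math import comb

def binomial_p_value(hits, trials, p_null=0.5):
    """One-sided p-value: probability of at least `hits` successes in
    `trials` attempts if the true hit rate is `p_null` (the null hypothesis)."""
    return sum(comb(trials, k) * p_null**k * (1 - p_null)**(trials - k)
               for k in range(hits, trials + 1))

# Hypothetical: 530 hits out of 1,000 guesses (a 53% hit rate).
print(f"p = {binomial_p_value(530, 1000):.3f}")  # ~0.03: rare under pure chance,
# but far short of what an extraordinary claim like precognition would demand.
```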
What is the issue with using a p-value threshold of .05 for determining statistical significance?
-Using a p-value threshold of .05 can lead to a high rate of false positives, especially when multiple hypotheses are being tested or when there is publication bias towards positive results.
What is 'p-hacking' and how does it increase the likelihood of false positives in research?
-P-hacking refers to the manipulation of data or statistical analysis methods to achieve a p-value below the threshold of significance (typically .05). This can involve selecting or excluding data points, changing analysis methods, or considering multiple variables, which increases the likelihood of finding at least one significant result by chance.
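A rough simulation makes the point: with no real effect anywhere, measuring many outcomes and keeping whichever one clears p < .05 produces "significant" findings most of the time. The group size and outcome count below are illustrative assumptions (requires NumPy and SciPy):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def fake_study(n_per_group=15, n_outcomes=18, alpha=0.05):
    """Two groups drawn from the same distribution (no real difference),
    with many outcome variables each tested at the usual .05 threshold."""
    for _ in range(n_outcomes):
        a = rng.normal(size=n_per_group)
        b = rng.normal(size=n_per_group)
        if stats.ttest_ind(a, b).pvalue < alpha:
            return True   # at least one outcome looks "significant"
    return False

n_studies = 2000
hits = sum(fake_study() for _ in range(n_studies))
print(f"{hits / n_studies:.0%} of no-effect studies produce a publishable result")
# Roughly 1 - 0.95**18, i.e. around 60%, even though every effect is pure noise.
```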
What was the result of the Reproducibility Project that attempted to replicate 100 psychology studies?
-The Reproducibility Project found that only 36% of the psychology studies had statistically significant results when replicated, indicating a significant issue with the reproducibility of published research.
What is the '5-sigma' standard used in particle physics and why is it significant?
-The '5-sigma' standard is a stringent requirement for statistical significance used in particle physics, corresponding to a p-value of roughly 0.0000003, about a one-in-3.5-million chance of seeing such an extreme result from random fluctuation alone. It is significant because it greatly reduces the likelihood of claiming a discovery based on random chance.
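For comparison, the tail probability behind the 5-sigma convention follows directly from the normal distribution; a minimal calculation:

```python
from math import erfc, sqrt

def one_sided_p(sigma):
    """Probability of a result at least `sigma` standard deviations above
    the mean when the null hypothesis (pure chance) is true."""
    return 0.5 * erfc(sigma / sqrt(2))

print(f"2 sigma: p = {one_sided_p(2):.3g}")  # ~0.023, near the everyday .05 cutoff
print(f"5 sigma: p = {one_sided_p(5):.3g}")  # ~2.9e-07, about 1 in 3.5 million
```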
What is the impact of publication bias on the reproducibility of scientific findings?
-Publication bias, where journals preferentially publish studies with statistically significant positive results, can lead to an overrepresentation of false positives in the scientific literature and makes it difficult for researchers to assess the true validity of findings.
What steps are being taken to address the reproducibility crisis in science?
-Steps being taken include conducting large-scale replication studies, establishing platforms like Retraction Watch to publicize withdrawn papers, creating online repositories for unpublished negative results, and adopting practices like pre-registering hypotheses and methods for peer review before experiments.
Why is it important to consider the potential for error even when using the scientific method?
-It is important because even with rigorous methods, errors can occur due to various factors such as p-hacking, publication bias, and the complexity of interpreting data. Recognizing this helps maintain a critical approach to scientific findings and encourages continuous improvement in research practices.
Outlines
The Illusion of Future Sight in Scientific Studies
This paragraph discusses a controversial study published in the 'Journal of Personality and Social Psychology' that suggests humans may possess the ability to see into the future. The study involved nine experiments where participants predicted which of two curtains hid an image. The hit rate for erotic images was slightly higher than chance, leading to a p-value of .01, which is considered significant in scientific research. However, the paragraph questions the validity of this result, explaining that a p-value of less than .05 is typically seen as significant but may not be enough to accept extraordinary claims like perceiving the future. It also delves into the broader issue of false positives in published research, highlighting that the commonly used 5% threshold for statistical significance may not be stringent enough, and that the actual rate of false positives could be much higher due to factors like publication bias and the prevalence of 'p-hacking'.
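The claim that the real false-positive rate can far exceed 5% becomes clearer with a back-of-the-envelope calculation. The fraction of true hypotheses and the statistical power used below are illustrative assumptions, not figures from the script:

```python
def fraction_of_positives_that_are_false(n=1000, frac_true=0.10, power=0.80, alpha=0.05):
    """Of all results that come out 'significant', how many are false positives?"""
    n_true = n * frac_true
    n_false = n - n_true
    true_positives = n_true * power     # real effects that get detected
    false_positives = n_false * alpha   # chance findings among false hypotheses
    return false_positives / (true_positives + false_positives)

# If only 10% of tested hypotheses are true, roughly a third of all
# "significant" results are false, even before p-hacking or publication bias.
print(f"{fraction_of_positives_that_are_false():.0%}")  # ~36%
```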
The Perils of P-Hacking and Replication in Scientific Research
The second paragraph expands on the problem of false positives in scientific research, illustrating how p-hacking can lead to misleading results. It uses the example of a study claiming that eating chocolate aids weight loss, which was deliberately designed with a small sample and many measured outcomes so that some spurious 'significant' finding was likely to emerge (a sketch of the arithmetic follows below). The paragraph explains how researchers can manipulate data analysis to achieve significant p-values even when there is no real effect. It also touches on the high standards of statistical significance in particle physics and the infamous case of the pentaquark, which was initially reported by multiple experiments but later dismissed as a false discovery caused by biased data analysis. The paragraph emphasizes the importance of replication in science but points out the challenges and biases that can hinder the process, including the reluctance of journals to publish replication studies and the pressure on researchers to produce novel and significant findings.
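The chocolate example reduces to simple arithmetic: the more independent outcomes a study measures, the more likely at least one falls below .05 by luck alone. The outcome counts below are illustrative assumptions:

```python
def chance_of_spurious_hit(n_outcomes, alpha=0.05):
    """Probability that at least one of n independent null results
    crosses the significance threshold purely by chance."""
    return 1 - (1 - alpha) ** n_outcomes

for m in (1, 5, 18):
    print(f"{m:>2} outcomes measured -> "
          f"{chance_of_spurious_hit(m):.0%} chance of a 'significant' finding")
# 1 -> 5%, 5 -> 23%, 18 -> 60%
```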
Towards Improvement: Addressing the Reproducibility Crisis in Science
The final paragraph acknowledges the ongoing reproducibility crisis in science but highlights positive changes in the scientific community's approach to research. It mentions large-scale replication studies, the Retraction Watch website, and the use of online repositories for sharing negative results. The paragraph also discusses the move towards pre-registering studies, which can help reduce publication bias and p-hacking by ensuring that research is published regardless of outcomes, provided the methodology is sound. The narrator reflects on the human tendency to be misled, even with rigorous scientific methods, and emphasizes the importance of science as a reliable method for understanding the world, despite its flaws. The paragraph concludes with a thank you to supporters and a promotion for Audible.com, offering a free trial and recommending a specific book.
Keywords
Anomalous Retroactive Influences
p-value
Statistical Significance
Reproducibility
Null Hypothesis
False Positives
p-hacking
Publication Bias
Replication Studies
5-sigma
Peer Review
Highlights
In 2011, a study was published suggesting that people can see into the future, with a hit rate of 53% for erotic images, which was statistically significant with a p-value of .01.
The significance of p-values in determining whether a result is due to chance or a true effect, with a common threshold of .05 for publication.
The potential for a large portion of published research to be false, especially when considering the number of hypotheses tested and the statistical power of experiments.
The 2005 paper 'Why Most Published Research Findings Are False', highlighting the prevalence of false positives in the scientific literature.
The Reproducibility Project's finding that only 36% of attempted psychology replications produced statistically significant results.
The challenge of replicating landmark cancer studies, with only 6 out of 53 studies successfully reproduced.
The phenomenon of 'p-hacking', where researchers manipulate data analysis to achieve statistically significant results.
The example of a study claiming that eating chocolate daily helps with weight loss, which was intentionally designed to increase the likelihood of false positives.
The issue of publication bias, where journals are more likely to publish studies with statistically significant results.
The incentives for scientists to publish novel and unexpected results, which can lead to an increase in tested hypotheses with a lower ratio of true relationships.
The difficulty of replicating studies and the reluctance of journals to publish replication studies, which hinders the self-correction of science.
The case of the pentaquark particle, where initial evidence was found but later studies could not confirm its existence, illustrating the problem of false discoveries in science.
The role of data interpretation in scientific research and how different research groups can draw different conclusions from the same data.
The steps being taken to address the reproducibility crisis in science, including large-scale replication studies and initiatives to publish null results.
The movement towards pre-registering hypotheses and methods for peer review before conducting experiments, aiming to reduce publication bias and p-hacking.
The reflection on the reliability of the scientific method despite its flaws, and the importance of using rigorous mathematical tools in the pursuit of truth.
The support for the video from Patreon and Audible.com, offering a free 30-day trial and highlighting the recommended book 'The Invention of Nature'.