The Replication Crisis: Crash Course Statistics #31
TLDR
The video discusses the 'replicability crisis' in scientific research, where many published studies fail to have their results replicated or reproduced by other researchers. It explores reasons for this crisis, including misuse of p-values, pressure to publish splashy results, small sample sizes, and a lack of data sharing. The video argues that more replication studies are needed to weed out false results, but that incentives must change to encourage this unglamorous work. It emphasizes that no single study proves a scientific truth; rather, the collective process of conducting and replicating research brings us closer to understanding reality.
Takeaways
- There is a replicability crisis in scientific research: many published studies cannot be replicated or reproduced.
- In one large replication project, fewer than half of published psychology studies could be replicated.
- Widespread misunderstanding of p-values leads to questionable conclusions.
- Some non-replication is due to fraud or questionable research practices.
- But even well-intentioned research can be irreproducible due to differences in analysis.
- Publication bias towards positive, novel findings makes non-replication likely.
- Replication helps distinguish real effects from flukes.
- More replication is needed, but it is not incentivized by funders or institutions.
- Lower p-value thresholds could reduce false positives (see the sketch after this list).
- Data sharing and transparency guidelines can aid reproducibility.
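To make the threshold point concrete, here is a minimal sketch of the arithmetic; the 1,000 hypothetical null studies are an assumed, illustrative number, not a figure from the video. Among effects that aren't real, the fraction expected to cross the significance threshold is the threshold itself.

```python
# Sketch: how a stricter p-value threshold cuts false positives.
# The 1,000 null studies are an assumed, illustrative number.
null_studies = 1000  # studies of effects that aren't actually real

for alpha in (0.05, 0.005):
    expected_fp = null_studies * alpha  # expected flukes crossing the threshold
    print(f"alpha = {alpha}: ~{expected_fp:.0f} false positives expected")

# alpha = 0.05  -> ~50 false positives expected
# alpha = 0.005 -> ~5 false positives expected (a tenfold reduction)
```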
Q & A
What percentage of studies were Amgen scientists able to replicate in the cancer treatment replication study?
- The Amgen scientists were able to replicate the original results only 11% of the time.
What did the American Statistical Association statement in 2016 aim to do regarding p-values?
- The statement aimed to help researchers better understand and use p-values. It advised that conclusions should not be based solely on whether a p-value passes a threshold, and that a p-value does not measure the importance of a result.
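A minimal simulation sketch of that last point, with an assumed setup that is not from the video: once the sample is large enough, even a negligible effect produces a tiny p-value, so a small p-value alone says little about whether a result matters.

```python
import numpy as np
from scipy import stats

# Sketch: a practically unimportant effect (0.01 standard deviations)
# still yields a minuscule p-value once the sample is large enough.
# The effect size and sample size are assumed for illustration.
rng = np.random.default_rng(0)
n = 1_000_000
control = rng.normal(loc=0.00, scale=1.0, size=n)
treated = rng.normal(loc=0.01, scale=1.0, size=n)  # negligible true effect

t_stat, p_value = stats.ttest_ind(control, treated)
print(f"p = {p_value:.1e}")  # tiny p-value despite a trivial effect size
```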
How can the bias towards publishing significant results contribute to the replication crisis?
- Studies that show promising but fluky significant results are more likely to get published, but when the study is repeated the fluke is unlikely to recur, so the results fail to reproduce.
What are some proposed solutions to improve reproducibility in research?
- Proposed solutions include more funding and incentives for replication studies, publishing null results, reevaluating p-value thresholds, researchers sharing data more openly, and journals adopting policies emphasizing reproducibility.
What was the false discovery rate in the hypothetical social priming example?
- The false discovery rate was 45 out of 105 significant results, or about 42.9%, meaning almost half of the published significant effects were false positives (the arithmetic is sketched below).
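The 45-out-of-105 figure falls out of arithmetic like the following; the specific inputs (1,000 studies, 100 real effects, 60% power, a 5% significance threshold) are assumptions chosen to reproduce the quoted numbers rather than values confirmed from the video.

```python
# Sketch of the false-discovery-rate arithmetic. The inputs are assumed
# values chosen to reproduce the 45/105 figure quoted above.
n_studies = 1000   # hypothetical studies run
n_real    = 100    # studies where the effect truly exists
power     = 0.60   # chance a real effect reaches significance
alpha     = 0.05   # chance a null effect reaches significance anyway

true_positives  = n_real * power                    # 100 * 0.60 = 60
false_positives = (n_studies - n_real) * alpha      # 900 * 0.05 = 45
significant     = true_positives + false_positives  # 105 significant results

fdr = false_positives / significant
print(f"FDR = {false_positives:.0f}/{significant:.0f} = {fdr:.1%}")
# -> FDR = 45/105 = 42.9%
```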
What is the value of replication and of debate over issues like power posing?
- It shows how replication refines and advances scientific understanding over time through the iterative process of building on previous research.
What percentage of researchers surveyed considered there to be a reproducibility crisis in science?
- 90%, with 52% calling it a "significant crisis" and 38% calling it a "slight crisis".
How can unclear analysis methods contribute to irreproducibility?
- If researchers don't fully explain their data analysis methods, others may be unable to reproduce their results even when using the same data.
What are some examples of unscrupulous research practices that hurt reproducibility?
- Examples include falsifying data, intentional p-hacking, and caring more about splashy headlines than sound science.
How can small sample sizes contribute to the replication crisis?
- Studies with fewer subjects are more likely to produce skewed estimates that don't hold up when the study is repeated (see the simulation sketch below).
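A minimal simulation sketch of the small-sample point; the true effect size, group sizes, and number of repetitions are assumptions for illustration, not figures from the video. The same true effect, estimated with small versus large samples, yields very different spreads of estimates.

```python
import numpy as np

# Sketch: the same true effect (0.3 SD) estimated with small vs. large
# samples. All numbers here are assumed for illustration.
rng = np.random.default_rng(1)
true_effect = 0.3

for n in (10, 500):  # subjects per group
    estimates = [
        rng.normal(true_effect, 1, n).mean() - rng.normal(0, 1, n).mean()
        for _ in range(10_000)
    ]
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    print(f"n = {n:>3}: 95% of estimates fall in [{lo:+.2f}, {hi:+.2f}]")

# With n = 10 per group, estimates swing roughly from -0.6 to +1.2 around
# the true 0.3, so a single small study can easily exaggerate or miss it.
```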
Outlines
Intro to the concepts of replication and reproducibility in scientific research
This segment introduces the concepts of replication and reproducibility in scientific research and discusses why they are essential for ensuring that research findings are valid and scientifically sound. Examples are given of studies across fields like biomedicine and psychology that have struggled to replicate original published results.
Understanding p-values and statistical significance
This segment digs deeper into p-values, statistical significance thresholds, and the proper interpretation of results. It references statements from the American Statistical Association about avoiding overreliance on p-values alone when drawing conclusions. The challenges of small sample sizes and publication bias towards positive results are also discussed.
Steps towards improving reproducibility in research
This closing segment explores potential solutions for improving reproducibility in research, including conducting more replication studies, reconsidering standard p-value thresholds, encouraging data sharing, and strengthening journal policies around transparency. It wraps up by discussing how back-and-forth debate and iteration on ideas are part of the process of scientific progress.
Keywords
- replication
- reproducibility
- p-values
- false positives
- incentives
- transparency
- power posing
- scientific process
- public trust
- causation
Highlights
Replication studies are essential to confirm research results
In one replication effort, only 11% of major cancer treatment studies could be replicated
In a large psychology replication project, fewer than half of published results were replicated
90% of researchers surveyed think there is a crisis related to reproducibility
Unclear analysis methods make reproducibility difficult even with the same data
Misuse of p-values leads to overstated conclusions not supported by the data
Published studies often overestimate effects due to publication bias
Small sample sizes lead to skewed, unreplicable results
More replication studies are needed, despite being expensive and less valued
Publishing null results could reduce publication bias
Stricter p-value thresholds could reduce false positives
Sharing data publicly makes reproducibility easier
Journals adopting reproducibility guidelines helps boost public trust
The power posing study controversially claimed confidence-boosting effects
Replication and debate bring science closer to the truth