scientific-critical-thinking

Evaluate scientific claims and evidence quality. Use for assessing experimental design validity, identifying biases and confounders, applying evidence grading frameworks (GRADE, Cochrane Risk of Bias), or teaching critical analysis. Best for understanding evidence quality and identifying flaws. For writing formal peer reviews, use the peer-review skill.


Scientific Critical Thinking

Skill Overview

A professional tool for systematically evaluating the quality of scientific research, used to review experimental design, identify biases, assess levels of evidence, and apply GRADE and Cochrane frameworks to judge the reliability of conclusions.

Applicable Scenarios

  • Paper Quality Assessment: Systematically check methodological rigor, statistical validity, and whether conclusions are adequately supported by the data, either when reading literature on your own or before a formal peer review.
  • Study Design Planning: Before starting a new study, obtain professional advice on randomization, blinding, sample size calculation, control of confounding factors, and other aspects to help design a more rigorous experimental protocol.
  • Evidence Synthesis Judgment: When conducting systematic reviews or meta-analyses, assess the level of evidence of included studies, judge quality differences among studies, and determine the credibility of conclusions.
Core Functions

  • Methodological Critique: Thoroughly evaluate whether the study design supports the research question, check internal validity, external validity, construct validity, and statistical conclusion validity, and identify issues in randomization, blinding, and control group setup.
  • Bias Identification: Systematically detect cognitive biases (such as confirmation bias, HARKing), selection bias, measurement bias, analysis bias (such as P-hacking, selective reporting), and confounding, and assess their impact on results.
  • Evidence Quality Assessment: Use the GRADE system and evidence grading frameworks to evaluate study design types, risk of bias, consistency of results, indirectness, and precision, and determine the credibility level of the evidence.
Frequently Asked Questions

    What is the difference between scientific critical thinking and formal peer review?

    Scientific critical thinking is mainly used for personal research evaluation, judging evidence quality, or internal quality checks before peer review. It does not produce formal review reports or feedback for authors. If you need to write an official peer review report, you should use the peer-review skill.

    How can you judge whether an observational study supports causal conclusions?

    Observational studies alone cannot directly prove causation. You need to assess: temporal order (whether the cause precedes the effect), whether there is a dose-response relationship, whether confounding factors are controlled, whether the biological mechanism is plausible, and whether different studies are consistent. If a paper reports only correlational evidence ("correlated", "associated") yet states causal conclusions, that is a red flag.
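The confounding check above can be sketched numerically. The following Python snippet uses entirely hypothetical counts to show how a crude exposure-outcome association can disappear once results are stratified by a confounder (the classic Simpson's paradox pattern):

```python
# A minimal sketch (hypothetical numbers) of checking whether a crude
# association survives stratification by a confounder. If the effect
# vanishes within strata, confounding -- not causation -- explains it.

def risk(events, n):
    return events / n

# Pooled data: exposure looks harmful (60% vs 35% event risk).
crude = (risk(160 + 20, 200 + 100), risk(80 + 60, 100 + 300))

# Stratified by a confounder (say, age band), the effect disappears:
strata = [
    # (exposed events, exposed n, unexposed events, unexposed n)
    (160, 200, 80, 100),   # high-risk stratum: 0.80 vs 0.80
    (20, 100, 60, 300),    # low-risk stratum:  0.20 vs 0.20
]

print(f"crude: exposed {crude[0]:.2f} vs unexposed {crude[1]:.2f}")
for i, (ee, en, ue, un) in enumerate(strata, 1):
    print(f"stratum {i}: exposed {risk(ee, en):.2f} vs unexposed {risk(ue, un):.2f}")
```

Here the exposed group is simply concentrated in the high-risk stratum, so the pooled comparison is misleading; within each stratum the risks are identical.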

    What are the "downgrade" criteria in GRADE assessment?

    The GRADE system starts from study design type (RCTs initially rated as high quality, observational studies initially as low quality), then considers five downgrading factors: risk of bias (randomization, blinding, adequacy of follow-up), inconsistency (conflicting results across studies), indirectness (whether population, intervention, outcomes align with the target), imprecision (wide confidence intervals, small sample sizes), and publication bias. Each serious problem leads to one level of downgrade.

    How can you identify P-hacking and selective reporting in studies?

    Check whether all pre-specified outcomes are reported (compare the study registry protocol with the published paper), whether multiple analyses were performed but only significant results reported, whether hypotheses were modified after seeing the results (HARKing), whether excessive subgroup analyses were conducted without multiple-comparison correction, and whether p-values are suspiciously clustered around 0.05.
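One of these checks, clustering of p-values just below 0.05, lends itself to a simple screening heuristic. The Python sketch below uses made-up p-values and an arbitrary 0.01-wide window; a strong asymmetry around the threshold is a flag for scrutiny, not proof of misconduct:

```python
# Rough screening heuristic (illustrative thresholds): in an unbiased
# literature, p-values just below 0.05 should not heavily outnumber
# those just above it.

def caliper_counts(p_values, threshold=0.05, width=0.01):
    just_below = sum(1 for p in p_values if threshold - width <= p < threshold)
    just_above = sum(1 for p in p_values if threshold <= p < threshold + width)
    return just_below, just_above

# Hypothetical p-values extracted from a set of papers:
reported = [0.041, 0.048, 0.049, 0.046, 0.044, 0.052, 0.012, 0.30, 0.047]
below, above = caliper_counts(reported)
print(f"just below 0.05: {below}, just above: {above}")  # 6 vs 1 -> suspicious
```

Formal versions of this idea (caliper tests, p-curve analysis) use proper statistical models; the point here is only that the pattern is checkable.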

    Can results from small-sample studies be trusted?

    Even if statistically significant, small-sample studies should be treated cautiously: effect sizes may be exaggerated, confidence intervals are often wide, and studies may be severely underpowered. Pay attention to whether an a priori power analysis was conducted, whether the effect size is practically meaningful, whether other studies replicate the finding, and whether results are overinterpreted. A single small-sample study typically provides only low-quality evidence.
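To make "a priori power analysis" concrete, here is a minimal Python sketch of the standard normal-approximation sample-size formula for a two-group comparison, n per group ≈ 2((z₁₋α/₂ + z₁₋β)/d)², where d is the standardized effect size (Cohen's d). It uses only the standard library; exact t-test calculations give slightly larger numbers:

```python
# Minimal a priori sample-size sketch for a two-sample comparison using
# the normal approximation; not a substitute for a full power analysis.
from statistics import NormalDist
import math

def n_per_group(d, alpha=0.05, power=0.80):
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, two-sided
    z_b = NormalDist().inv_cdf(power)           # quantile for desired power
    return math.ceil(2 * ((z_a + z_b) / d) ** 2)

# A "medium" effect (d = 0.5) needs ~63 per group by this approximation;
# a small effect (d = 0.2) needs ~393 per group.
print(n_per_group(0.5), n_per_group(0.2))
```

The steep growth of n as d shrinks is exactly why small studies reporting large effects deserve suspicion: if the true effect were small, the study could not have detected it reliably.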