Prompt Details
Model: claude-4-6-sonnet
Token size: 2,709
Example input
[PAPER_DESCRIPTION]: This empirical paper evaluates "CBT-E" (Enhanced Cognitive Behavioral Therapy) for treating severe generalized anxiety disorder (GAD). The main claim is that CBT-E results in a 30% greater reduction in anxiety symptoms compared to standard CBT. The researchers conducted a randomized controlled trial (RCT) with 200 adults diagnosed with GAD. Participants were split into two groups (CBT-E vs. Standard CBT) for 12 weeks. Outcomes were measured using the GAD-7 assessment tool at baseline, week 12, and a 6-month follow-up. The authors report a p-value of <0.01 for the primary outcome and claim the results demonstrate long-term durability. However, the dropout rate was 15% in the CBT-E group compared to 5% in the standard group, which was not heavily addressed.
[READER_GOAL]: Should I recommend this specific therapy modification in our clinic's new standard operating procedures for treating GAD?
[DOMAIN]: Clinical Psychology / Psychiatry
Example output
### PAPER BRIEF
```
PAPER TYPE: Empirical — Randomized Controlled Trial (RCT)
DOMAIN: Clinical Psychology / Psychiatry
PRIMARY CLAIM: CBT-E produces 30% greater reduction in anxiety symptoms
compared to standard CBT in adults with GAD
METHOD USED: RCT, two-arm parallel group, 12-week intervention
SAMPLE / DATA: 200 adults with GAD diagnosis; GAD-7 at baseline,
week 12, and 6-month follow-up
READER GOAL: Should CBT-E be recommended in clinic SOPs for GAD treatment?
```
---
### CLAIM INVENTORY
```
── CLAIM 01 ──────────────────────────────────────────────────────────────
TYPE: PRIMARY
CLAIM: CBT-E produces 30% greater symptom reduction than standard CBT
METHOD: RCT with GAD-7 measurement at baseline and week 12
THRESHOLD: CAUSAL — requires RCT with adequate power, blinding, ITT
analysis, and controlled confounders
VERDICT: WEAK FIT
NOTE: RCT design is appropriate for causal claims, but the
differential dropout (15% vs 5%) raises serious concern
about whether the 30% figure reflects treatment efficacy
or attrition bias. Handling of missing data not reported.
──────────────────────────────────────────────────────────────────────────
── CLAIM 02 ──────────────────────────────────────────────────────────────
TYPE: SUPPORTING
CLAIM: Results are statistically significant (p < 0.01)
METHOD: Inferential statistics on GAD-7 scores
THRESHOLD: STATISTICAL ADEQUACY — p-value alone is insufficient;
effect size and confidence intervals are needed
VERDICT: WEAK FIT
NOTE: p < 0.01 is reported but no effect size (Cohen's d,
partial η²) or confidence interval is mentioned. With
n=200, the study may be adequately powered, but the
clinical magnitude of the 30% difference cannot be
independently evaluated from p alone.
──────────────────────────────────────────────────────────────────────────
── CLAIM 03 ──────────────────────────────────────────────────────────────
TYPE: SUPPORTING
CLAIM: Treatment effects demonstrate long-term durability
METHOD: 6-month follow-up GAD-7 assessment
THRESHOLD: LONGITUDINAL — requires retention rates, ITT or
per-protocol specification, and attrition accounting
VERDICT: WEAK FIT
NOTE: If the CBT-E group already had 15% dropout by week 12,
the 6-month follow-up sample is likely further reduced
and potentially unrepresentative. Durability claim rests
on a likely-biased subsample. This is the paper's most
vulnerable claim.
──────────────────────────────────────────────────────────────────────────
── CLAIM 04 ──────────────────────────────────────────────────────────────
TYPE: BACKGROUND
CLAIM: GAD-7 is an appropriate outcome measure for GAD
METHOD: Not tested — assumed by authors
THRESHOLD: CONSTRUCT VALIDITY — not under examination here
VERDICT: APPROPRIATE (GAD-7 is a validated, widely used tool;
this assumption is defensible)
──────────────────────────────────────────────────────────────────────────
```
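
Claim 02 asks for an effect size and confidence interval alongside the p-value. As a rough sketch of what that would look like, the snippet below computes Cohen's d with a large-sample 95% interval. All input numbers are hypothetical placeholders (the paper description gives no means or standard deviations), and the arm sizes assume an even 100/100 split with completers only, which the description does not confirm.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def d_confidence_interval(d, n_a, n_b, z=1.96):
    """Approximate 95% CI for d from its large-sample standard error."""
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + d**2 / (2 * (n_a + n_b)))
    return d - z * se, d + z * se


# HYPOTHETICAL GAD-7 change scores (baseline minus week 12), not from the paper.
# ASSUMPTION: 100 per arm at randomization, so 85 and 95 completers.
d = cohens_d(mean_a=7.8, sd_a=4.0, n_a=85, mean_b=6.0, sd_b=4.0, n_b=95)
low, high = d_confidence_interval(d, n_a=85, n_b=95)
print(f"d = {d:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

With real summary statistics from the authors, the same two functions would show whether the interval excludes trivially small effects, which p < 0.01 alone cannot.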
---
### QUALITY SCORECARD
```
DIMENSION             SCORE          NOTE
──────────────────────────────────────────────────────────────────────────
Internal Validity     WEAK           Differential dropout (15% vs 5%)
                                     is a serious threat to group
                                     equivalence at outcome. Missing
                                     data handling not reported. ITT
                                     analysis not confirmed.
External Validity     CANNOT ASSESS  Sample characteristics (severity,
                                     comorbidities, demographics) not
                                     described. Generalisability to
                                     your clinic population unknown.
Construct Validity    ADEQUATE       GAD-7 is a validated measure.
                                     However, "30% greater reduction"
                                     is not anchored to a minimal
                                     clinically important difference
                                     (MCID), so clinical vs statistical
                                     significance is unresolved.
Statistical Adequacy  WEAK           p < 0.01 reported; no effect size,
                                     no confidence intervals, no
                                     power calculation mentioned. Cannot
                                     assess clinical magnitude of the
                                     finding independently.
Reproducibility       WEAK           Dropout handling undisclosed.
                                     Therapist training/fidelity
                                     protocols not described. No
                                     mention of blinding procedures
                                     or allocation concealment.
──────────────────────────────────────────────────────────────────────────
```
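
The Construct Validity note turns on whether "30% greater reduction" translates into a clinically meaningful number of GAD-7 points. A minimal sketch of that conversion follows. It assumes the 30% is a relative gain (CBT-E reduction = 1.3 × standard CBT reduction), which the paper description does not state, and the MCID value is a placeholder to be replaced with a figure from the literature.

```python
def absolute_gap(standard_reduction, relative_gain=0.30):
    """Between-group difference in GAD-7 points implied by a relative gain."""
    return standard_reduction * relative_gain


def clears_mcid(standard_reduction, mcid, relative_gain=0.30):
    """True if the implied between-group gap reaches the MCID."""
    return absolute_gap(standard_reduction, relative_gain) >= mcid


# PLACEHOLDER threshold, not taken from the paper; substitute a sourced MCID.
HYPOTHETICAL_MCID = 4.0

# HYPOTHETICAL standard-CBT reductions in GAD-7 points (scale range is 0 to 21).
for reduction in (4.0, 8.0, 12.0, 16.0):
    gap = absolute_gap(reduction)
    verdict = "meets" if clears_mcid(reduction, HYPOTHETICAL_MCID) else "below"
    print(f"standard CBT -{reduction:.0f} pts -> gap {gap:.1f} pts ({verdict} MCID)")
```

The point of the sketch is that the same relative figure can sit on either side of the threshold depending on the control arm's absolute improvement, which is why the scorecard treats the claim as unanchored.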
---
### FINDING CARDS
```
── FINDING 01 ────────────────────────────────────────────────────────────
TYPE: WEAK
SIGNAL: CBT-E outperforms standard CBT on GAD-7 at week 12
EVIDENCE: RCT design + p < 0.01 on primary outcome
CONFIDENCE: MEDIUM (55%) — RCT is appropriate design; p-value
supports a directional finding. Confidence capped at
MEDIUM because: (a) differential dropout threatens
group comparability, (b) no effect size reported,
(c) missing data handling unknown.
IMPACT: The direction of the finding (CBT-E > standard CBT) is
plausible and worth noting, but the 30% magnitude
figure cannot be trusted without attrition correction.
──────────────────────────────────────────────────────────────────────────
── FINDING 02 ────────────────────────────────────────────────────────────
TYPE: CONTESTED
SIGNAL: Long-term durability of CBT-E effects at 6 months
EVIDENCE: 6-month follow-up GAD-7 assessment reported
CONFIDENCE: LOW (30%) — Follow-up data almost certainly drawn from
a self-selected subsample (completers only, after 15%
CBT-E dropout). No retention figures at 6 months
provided. Survivorship bias is highly probable.
IMPACT: The durability claim is the weakest in the paper.
Do not use it to justify long-term SOP commitments
without independent corroboration.
──────────────────────────────────────────────────────────────────────────
── FINDING 03 ────────────────────────────────────────────────────────────
TYPE: GAP
SIGNAL: Why did CBT-E have 3× the dropout rate of standard CBT?
EVIDENCE: Dropout noted (15% vs 5%) but "not heavily addressed"
per paper description
CONFIDENCE: UNKNOWN — Cause of differential dropout not reported
IMPACT: This is clinically critical for SOP decisions. Higher
dropout may indicate patient burden, distress akin to a
side effect, or poor fit for certain patient profiles.
Recommending CBT-E without understanding why patients
disengage at 3× the rate introduces real clinical risk.
──────────────────────────────────────────────────────────────────────────
── FINDING 04 ────────────────────────────────────────────────────────────
TYPE: GAP
SIGNAL: Clinical significance of "30% greater reduction"
not established
EVIDENCE: No MCID anchor, no effect size, no CI reported
CONFIDENCE: UNKNOWN
IMPACT: A statistically significant 30% improvement on GAD-7
could range from barely perceptible to transformative
in practice. Cannot determine whether this crosses the
threshold for routine clinical implementation.
──────────────────────────────────────────────────────────────────────────
── FINDING 05 ────────────────────────────────────────────────────────────
TYPE: GAP
SIGNAL: Therapist competency and treatment fidelity protocols
not described
EVIDENCE: Not mentioned in paper description
CONFIDENCE: UNKNOWN — UNDISCLOSED gap
IMPACT: CBT-E results in an RCT setting depend heavily on
therapist training. Replicating outcomes in a routine
clinic without fidelity standards is uncertain.
──────────────────────────────────────────────────────────────────────────
```
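
Finding 03 rests on the 15% vs 5% dropout gap. To gauge whether a gap of that size could plausibly be chance, a two-proportion z-test can be sketched as below. It assumes 100 participants per arm; the paper description says only that 200 adults were split into two groups, so the even split is an assumption.

```python
import math


def two_proportion_z(events_a, n_a, events_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# ASSUMPTION: equal allocation, 100 per arm (not stated in the description).
n_per_arm = 100
dropouts_cbte, dropouts_std = 15, 5

risk_ratio = (dropouts_cbte / n_per_arm) / (dropouts_std / n_per_arm)
z, p = two_proportion_z(dropouts_cbte, n_per_arm, dropouts_std, n_per_arm)
print(f"risk ratio = {risk_ratio:.1f}, z = {z:.2f}, p = {p:.3f}")
```

With only 20 dropout events in total, an exact test would be a reasonable alternative; the normal approximation is used here to keep the sketch dependency-free.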
---
### EVIDENCE INVENTORY
```
EXISTS:
→ RCT design (appropriate for causal inference)
→ Validated outcome measure (GAD-7)
→ Moderate sample size (n=200; no power calculation reported)
→ Statistical significance reported (p < 0.01)
→ Two measurement points post-baseline (week 12 + 6-month follow-up)
→ Dropout rates disclosed (partial credit — rates given, causes not)
MISSING:
→ Effect size and confidence intervals [UNDISCLOSED]
→ Missing data / ITT analysis specification [UNDISCLOSED]
→ Reasons for differential dropout [UNDISCLOSED]
→ 6-month follow-up retention figures [UNDISCLOSED]
→ Sample demographic and clinical characteristics [UNDISCLOSED]
→ Therapist training and fidelity monitoring [UNDISCLOSED]
→ Blinding and allocation concealment procedures [UNDISCLOSED]
→ MCID anchor for GAD-7 change scores [ADDRESSABLE]
→ Adverse effects or patient burden data [UNDISCLOSED]
GAP CLASSIFICATION SUMMARY:
ADDRESSABLE: 1 (MCID comparison — retrievable from literature)
STRUCTURAL: 0
UNDISCLOSED: 8 (all other gaps above, absent from this paper)
──────────────────────────────────────────────────────────────────────────
```
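
The gap classification summary is a tally of the tags on the MISSING list, so it can be checked mechanically. The sketch below transcribes the nine MISSING entries and their tags from the inventory above and counts them; the variable names are illustrative only.

```python
from collections import Counter

# Entries and tags transcribed from the MISSING list in the inventory.
MISSING = [
    ("Effect size and confidence intervals", "UNDISCLOSED"),
    ("Missing data / ITT analysis specification", "UNDISCLOSED"),
    ("Reasons for differential dropout", "UNDISCLOSED"),
    ("6-month follow-up retention figures", "UNDISCLOSED"),
    ("Sample demographic and clinical characteristics", "UNDISCLOSED"),
    ("Therapist training and fidelity monitoring", "UNDISCLOSED"),
    ("Blinding and allocation concealment procedures", "UNDISCLOSED"),
    ("MCID anchor for GAD-7 change scores", "ADDRESSABLE"),
    ("Adverse effects or patient burden data", "UNDISCLOSED"),
]

tally = Counter(tag for _, tag in MISSING)
for tag in ("ADDRESSABLE", "STRUCTURAL", "UNDISCLOSED"):
    print(f"{tag}: {tally[tag]}")
```

Running a tally like this whenever the list changes keeps the summary counts and the verdict text in step with the inventory.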
---
### READER VERDICT
```
GOAL: Should CBT-E be recommended in clinic SOPs for GAD treatment?
VERDICT: NO — not on the basis of this paper alone
REASONING: The RCT design gives this paper a credible foundation,
and the directional finding (CBT-E outperforms standard
CBT) is worth taking seriously. However, the 3× higher
dropout in the CBT-E arm — left unexplained — is a
direct patient safety and feasibility concern for SOP
adoption. The durability claim rests on a likely biased
follow-up sample. Eight undisclosed data points prevent
independent verification of the primary finding. SOPs
affect all eligible patients; a single trial with these
gaps does not meet the evidentiary threshold for that
scope of implementation.
CONDITION: Verdict would shift to CONDITIONAL YES if:
(1) Authors provide ITT analysis with dropout imputation
(2) Differential dropout causes are explained and
clinically manageable
(3) Between-group difference exceeds the MCID for GAD-7
(one commonly cited estimate is about 4 points;
confirm against current literature)
(4) At least one independent replication exists
(5) Therapist fidelity requirements are specified
and feasible in your clinic context
──────────────────────────────────────────────────────────────────────────
```
---
### REPRODUCIBILITY FLAG
```
REPRODUCIBILITY RISK: HIGH
KEY THREATS:
→ Differential dropout (15% vs 5%) with no imputation or
ITT analysis reported — results may not survive reanalysis
→ Therapist training protocols undisclosed — the most common
reason RCT psychotherapy findings fail to replicate in
community settings
→ Blinding and allocation concealment not described — risk
of selection and performance bias unknown
→ 6-month follow-up sample size and composition not reported —
durability finding cannot be independently evaluated
→ No effect size reported — magnitude of benefit unverifiable
WHAT WOULD HELP:
→ Full CONSORT-compliant trial report (flow diagram, ITT table)
→ Sensitivity analysis comparing completer-only vs ITT results
→ Therapist training manual or fidelity checklist
→ Dropout interview data or stated reasons for discontinuation
→ Pre-registration record (e.g. ClinicalTrials.gov) to verify
outcome measures were not changed post-hoc
──────────────────────────────────────────────────────────────────────────
```
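
The sensitivity analysis suggested above (completer-only vs ITT) can be illustrated with a deliberately conservative imputation: every dropout is assigned zero improvement, as in baseline-observation-carried-forward. The sketch below uses hypothetical completer means and assumes 100 participants per arm; none of these figures come from the paper, and multiple imputation would usually be preferred in a real reanalysis.

```python
def itt_mean_change(completer_mean, n_completers, n_dropouts, imputed_change=0.0):
    """Arm-level mean change after imputing a fixed change for each dropout."""
    total = completer_mean * n_completers + imputed_change * n_dropouts
    return total / (n_completers + n_dropouts)


# HYPOTHETICAL completer means (GAD-7 points of improvement), chosen so the
# completer-only comparison reproduces a 30% relative advantage.
# ASSUMPTION: 100 randomized per arm, so 15 and 5 dropouts respectively.
cbte_completer_mean, std_completer_mean = 7.8, 6.0

itt_cbte = itt_mean_change(cbte_completer_mean, n_completers=85, n_dropouts=15)
itt_std = itt_mean_change(std_completer_mean, n_completers=95, n_dropouts=5)

completer_ratio = cbte_completer_mean / std_completer_mean
itt_ratio = itt_cbte / itt_std
print(f"completer-only advantage: {(completer_ratio - 1) * 100:.0f}%")
print(f"ITT (zero-change imputation) advantage: {(itt_ratio - 1) * 100:.0f}%")
```

Under these illustrative inputs the apparent advantage shrinks once dropouts are counted, which is the pattern the differential attrition concern predicts. The actual size of the shrinkage depends on data only the authors hold.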
---
**A practical note for your context:** Before any SOP decision, check whether CBT-E has independent replication trials and whether a CONSORT-compliant version of this paper (or its full publication) addresses the gaps flagged above. If you can share the full methods section, the confidence scores on Findings 01 and 02 could shift substantially in either direction.
Single Paper Analyst™ runs the C.R.I.T.I.Q.U.E.™ protocol on any paper you feed it: extracting every claim, auditing the methodology against the claim type, scoring five quality dimensions, and assigning a confidence level to each finding — with explicit basis.
