PromptBase

Prompt Details

Model: claude-4-7-opus
Token size: 752
Example input
[Basic / Advanced / Enterprise]: Advanced (multi-source retrieval with ranking)
[Docs / DB / Web / Mixed]: Mixed (internal docs + web search)
[Q&A / Analysis / Mixed]: Q&A with analysis
[High / Very High / Critical]: Very High (95%+ confidence threshold)
[Text / Structured]: Structured with confidence metrics
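Taken together, these selections amount to a configuration for the verification pipeline. A minimal sketch of how they might be encoded follows; the key names and the threshold values other than 95% are illustrative assumptions, not defined by the prompt itself.

```python
# Illustrative encoding of the example-input selections above.
# Key names are assumptions for this sketch, not part of the prompt.
verification_config = {
    "mode": "Advanced",            # Basic / Advanced / Enterprise
    "sources": "Mixed",            # Docs / DB / Web / Mixed
    "task": "Q&A with analysis",   # Q&A / Analysis / Mixed
    "accuracy": "Very High",       # High / Very High / Critical
    "output_format": "Structured", # Text / Structured (with confidence metrics)
}

# "Very High" maps to the 95%+ confidence threshold the example uses;
# the other two values here are placeholder assumptions.
CONFIDENCE_THRESHOLDS = {"High": 0.85, "Very High": 0.95, "Critical": 0.99}
threshold = CONFIDENCE_THRESHOLDS[verification_config["accuracy"]]
```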
Example output
### SAMPLE TEST SCENARIO

**User Query:** "What is the current MRR growth rate for SaaS companies with ARR between $1M-$10M, and how does this compare to market benchmarks from 2024?"

---

## ✅ 1️⃣ ANSWER GENERATION

**Initial Response (Before Verification):**

• SaaS companies in the $1M-$10M ARR band are experiencing MRR growth rates averaging 8-12% month-over-month
• This represents a deceleration from 2023 benchmarks of 12-15% MRR growth
• Market consolidation and increased competition are cited as primary drivers
• Top quartile performers (growth leaders) maintain 15-18% MRR growth
• Bottom quartile companies show 3-5% MRR growth or negative growth

**Context Provided:**

• Response draws from industry reports, SaaS benchmarking databases, and venture capital investment trends
• Time period referenced: Q1-Q2 2024
• Geographic scope: Primarily North American SaaS market
• Segment specificity: B2B SaaS companies with product-led or sales-led go-to-market models

**Sources Cited (Raw):**

• Benchmark reports (unnamed)
• SaaS growth databases (undefined)
• Market analysis documents (vague attribution)
• Venture analysis (no specific source)

---

## 🔍 2️⃣ SOURCE GROUNDING

**Source Verification Process:**

• **Benchmark Report #1 - CLAIM CHECK**
  - Claim: "8-12% MRR growth is current average"
  - Source Status: PARTIALLY GROUNDED
  - Evidence found: Industry reports mention 7-13% range but vary by quarter
  - Confidence in source: Medium (source lacks specificity on 2024 data, uses 2023 trailing averages)
• **Benchmark Report #2 - CLAIM CHECK**
  - Claim: "2023 benchmarks were 12-15% MRR growth"
  - Source Status: GROUNDED WITH CAVEAT
  - Evidence found: Yes, but only for top 50% of companies; bottom 50% was 5-9%
  - Confidence in source: Medium-High (credible but selective data presentation)
• **Market Consolidation Claim - CLAIM CHECK**
  - Claim: "Consolidation and competition are primary drivers of deceleration"
  - Source Status: NOT DIRECTLY GROUNDED
  - Evidence found: Anecdotal mentions in venture reports; no empirical data linking consolidation to growth decline
  - Confidence in source: Low (inference, not data)
• **Top Quartile Claim - CLAIM CHECK**
  - Claim: "Top quartile maintains 15-18% MRR growth"
  - Source Status: PARTIALLY GROUNDED
  - Evidence found: One analyst report shows 16-19% for top 25%, but sample size was only 47 companies
  - Confidence in source: Low-Medium (small sample, may not be representative)
• **Citation Quality Assessment:**
  - Original response made 5 major claims
  - Only 3 claims directly grounded in sources
  - 2 claims are inferences or extrapolations
  - 1 claim relies on vague data aggregation
  - Missing: specific source publication dates, data collection methodology, sample sizes

---

## ⚙️ 3️⃣ VALIDATION ENGINE

**Factual Validation:**

• **Claim: "8-12% MRR growth is current average for $1M-$10M ARR band"**
  - Cross-check Method: Triangulation across 3+ independent sources
  - Result: 2 sources support 8-12%, 1 source suggests 7-11%, 1 source says 9-13%
  - Verdict: VALID (consensus range 7-13%, 8-12% falls within consensus)
  - Risk: Range is wide; specific claim of "8-12%" is approximate
• **Claim: "This represents deceleration from 2023's 12-15%"**
  - Cross-check Method: Year-over-year comparison from same data sources
  - Result: 2023 data found, but mixed: some companies show 12-15%, others show 8-10%
  - Verdict: PARTIALLY VALID (true for top 50%, not universally true)
  - Risk: Misleading framing (sounds universal when it's segment-dependent)
• **Claim: "Consolidation and competition are primary drivers"**
  - Cross-check Method: Causal analysis from research papers or earnings calls
  - Result: No empirical data supports this causal link; it's editorial interpretation
  - Verdict: INVALID AS STATED (unsupported causal claim)
  - Risk: HIGH (presents opinion as fact)
• **Consistency Check Across Claims:**
  - Internal logic: Claims are internally consistent
  - External logic: Top quartile claim (15-18%) contradicts "8-12% average" if top quartile only represents 25% of companies
  - Math validation: If average is 8-12% and top quartile is 15-18%, bottom quartile should be 0-3% to balance—but we claimed 3-5%
  - Verdict: INCONSISTENCY DETECTED (math doesn't reconcile across segments)

---

## 🚨 4️⃣ HALLUCINATION DETECTION

**Unsupported Claims Identified:**

• **HALLUCINATION #1: Causal Attribution**
  - Claim: "Market consolidation and increased competition are cited as primary drivers"
  - Type: Unsupported inference
  - Severity: MEDIUM
  - Issue: Response presents this as if it's sourced, but no source directly states this as a primary driver
  - Red Flag: Use of passive voice ("are cited") obscures who is citing this
• **HALLUCINATION #2: Segment Precision**
  - Claim: "Top quartile performers maintain 15-18% MRR growth" (presented as specific fact)
  - Type: Overly confident range from limited data
  - Severity: MEDIUM
  - Issue: Only 1 source with n=47 companies supports this; range is extrapolated, not measured
  - Red Flag: Specific range (15-18%) suggests precision that doesn't exist in source
• **HALLUCINATION #3: Data Aggregation**
  - Claim: "8-12% MRR growth" as market average
  - Type: Implicit data aggregation without methodology disclosure
  - Severity: LOW-MEDIUM
  - Issue: Response doesn't explain how average was calculated (mean? median? weighted? time period?)
  - Red Flag: Different calculation methods yield different results (8% vs 11% difference is material)
• **HALLUCINATION #4: Temporal Specificity**
  - Claim: Benchmarks are from 2024 (implied by "current")
  - Type: Ambiguous data freshness
  - Severity: LOW
  - Issue: Data used is Q1-Q2 2024, but response implies full-year 2024 validity
  - Red Flag: Could be outdated by Q3-Q4 performance
• **HALLUCINATION #5: Scope Creep**
  - Original query asks for "current growth rate" and "market benchmarks"
  - Response adds "top quartile" and "bottom quartile" analysis not requested
  - Type: Scope hallucination (adding facts not grounded in original scope)
  - Severity: LOW (adds value but increases unverified claims)

---

## 📊 5️⃣ CONFIDENCE SCORING

**Per-Claim Confidence Assessment:**

• **"8-12% MRR growth is current average"**
  - Source confidence: 70%
  - Factual confidence: 75%
  - Overall confidence: 72%
  - Reasoning: Multiple sources support range; data is recent; but aggregation methodology unclear
• **"2023 benchmarks were 12-15%"**
  - Source confidence: 65%
  - Factual confidence: 60%
  - Overall confidence: 62%
  - Reasoning: True for subset of data; presentation is misleading; time comparison is valid but incomplete
• **"Consolidation and competition are primary drivers"**
  - Source confidence: 25%
  - Factual confidence: 30%
  - Overall confidence: 27%
  - Reasoning: Logical inference but not empirically validated; editorial, not factual
• **"Top quartile maintains 15-18% MRR growth"**
  - Source confidence: 50%
  - Factual confidence: 55%
  - Overall confidence: 52%
  - Reasoning: One source with small sample; range is extrapolated; needs corroboration
• **"Bottom quartile shows 3-5% growth"**
  - Source confidence: 40%
  - Factual confidence: 45%
  - Overall confidence: 42%
  - Reasoning: Inverse calculation, not directly measured; assumes distribution that may not exist

**AGGREGATE CONFIDENCE SCORE: 51%**

• **Accuracy Level:** MEDIUM (below 95% threshold for "Very High" requirement)
• **Recommendation:** FLAG FOR REVIEW before delivery to end user
• **Risk Assessment:** HIGH—multiple unsupported claims could mislead business decisions

---

## 🛠️ 6️⃣ FILTERING & CORRECTION

**Corrected Answer (Hallucinations Removed):**

• **Core Claim (High Confidence - 85%):**
  - Recent SaaS benchmarking data suggests MRR growth for companies in the $1M-$10M ARR range averages 8-12% month-over-month in Q1-Q2 2024, down from historical averages of 12-15% observed in comparable companies during 2023
• **Qualified Insight (Medium Confidence - 65%):**
  - This deceleration appears correlated with broader market dynamics, though causal analysis would require additional empirical research
• **Segment-Specific Data (Low-Medium Confidence - 55%):**
  - Early-stage data suggests top-performing companies in this cohort may achieve 15-18% growth, while slower-growing companies experience 3-5% growth, but this data is based on limited samples and should be validated
• **Data Limitations (Transparency - Critical):**
  - This analysis is based on Q1-Q2 2024 data; full-year 2024 results may differ
  - Benchmarking methodology varies across sources; specific definitions of "ARR band" and "MRR growth" may differ
  - Geographic and vertical segment variations are not accounted for in this aggregate view
  - Causal attribution (why growth declined) is speculative and not empirically grounded
• **Fallback Recommendation:**
  - For high-stakes decisions, consult proprietary industry reports (Gartner, Forrester, SaaS Capital Index) or commission custom benchmarking analysis with clearly defined parameters

---

## 🔄 7️⃣ FEEDBACK INTEGRATION

**Learning Loop Activation:**

• **Hallucination Root Cause Analysis:**
  - Over-confidence in causal claims: Model inferred causation without explicit source data
  - Source ambiguity: Original response did not cite specific reports, allowing confabulation
  - Scope expansion: Model added "top/bottom quartile" analysis beyond query scope, increasing error surface
  - Aggregation gaps: Model did not disclose how "8-12%" was derived, enabling hidden assumptions
• **System Correction Triggers:**
  - Flag any causal claims ("is caused by," "drives," "leads to") that aren't explicitly sourced
  - Require source URLs or publication dates for any benchmark data
  - Restrict analysis to query scope; prompt user before adding unsolicited analysis
  - Require methodology explanation whenever aggregating multiple sources
• **Training Update:**
  - Add penalty weight for unsourced causal claims in model training
  - Increase emphasis on confidence score transparency in output generation
  - Implement mandatory source disclosure before factual claims
  - Add validation gate: all claims must be traceable to specific source documents

---

## 📡 8️⃣ MONITORING & ALERTS

**Real-Time Monitoring Dashboard (If Deployed):**

• **Error Rate Tracking:**
  - Unsupported claims per response: Current test = 5 hallucinations
  - Hallucination rate: 50% of major claims lack full source grounding
  - Alert threshold: Trigger review if >25% of claims are unsupported
  - Status: 🔴 ALERT—exceeds threshold
• **Confidence Score Monitoring:**
  - Aggregate confidence: 51% (FAILING—target is 95%+)
  - Claim-level range: 27%-85% (HIGH VARIANCE—indicates unreliable output)
  - Alert threshold: Suppress output if aggregate confidence <70%
  - Status: 🔴 ALERT—output should not be delivered as-is
• **Source Quality Tracking:**
  - Named sources: 0 (no specific publication cited)
  - Ambiguous sources: 4 ("benchmark reports," "databases," etc.)
  - High-quality sources (peer-reviewed or primary data): 0
  - Alert threshold: Require named sources for all claims
  - Status: 🔴 ALERT—source attribution is too vague
• **User Risk Assessment:**
  - Use case: Business decision (MRR planning, strategy)
  - Risk if wrong: High (could affect hiring, spending, growth targets)
  - Current system readiness: NOT READY for production use without review
  - Recommended action: Route to human expert before sharing with stakeholders

---

## 🚀 9️⃣ DEPLOYMENT & SCALING

**Production Pipeline (Architecture):**

• **Pre-Generation Gate (Input Validation):**
  - Check: Is query factual or opinion-based? → Route appropriately
  - Check: Does query scope match available sources? → Flag if mismatch
  - Check: Is user role authorized for this risk level? → Apply access control
• **Generation Phase (With Constraints):**
  - Generate answer with inline source markers: [Source: XYZ, Date: 2024-Q2]
  - Flag uncertain claims in real time: [CONFIDENCE: 55%, requires review]
  - Limit confidence to sourced data only; do not extrapolate
  - Add system prompt: "Do not infer causation without explicit source; use 'may be correlated' instead"
• **Validation Phase (Automated Checks):**
  - Claim extraction: Parse answer into discrete factual claims
  - Source matching: Compare each claim to source document chunks
  - Confidence calculation: Use heuristic—if claim appears in source verbatim or near-verbatim = 90%+, if inferred = 50%, if extrapolated = 30%
  - Aggregation: Calculate response-level confidence as median of claim confidences
• **Decision Gate (Before Output):**
  - If aggregate confidence >= 85%: Deliver with confidence label
  - If aggregate confidence 70-84%: Deliver with uncertainty disclaimer and recommendation to verify
  - If aggregate confidence 50-69%: Suppress; prompt user for clarification or source data; offer escalation to expert
  - If aggregate confidence <50%: Block output; log as failed case; notify admin
• **API Endpoint (Example Structure):**
  - Input: `{query, user_role, risk_level, data_sources, accuracy_threshold}`
  - Processing: Run through validation pipeline above
  - Output: `{answer, confidence_score, claim_breakdown, sources_cited, uncertainty_flags, recommendation}`
  - Error state: `{status: "insufficient_confidence", threshold_required: 85%, current: 51%, escalation_link: "..."}`
• **Scaling Considerations:**
  - Batch processing: For non-real-time queries, run multiple validation passes
  - Caching: Store validated claim-confidence pairs to avoid re-validation
  - Monitoring: Log all low-confidence outputs to improve the model over time
  - Feedback loop: Route user corrections back to the training pipeline

---

## 🎯 🔟 TRUST BLUEPRINT (FINAL SUMMARY)

**System Performance on Sample Test:**

• **Accuracy Level: 51% (FAILING)**
  - Only 3 of 5 major claims are adequately grounded
  - 2 claims contain significant unsupported inferences
  - Causal attribution is speculative, not empirical
  - Mathematical consistency check revealed an inconsistency between segment claims
  - Verdict: Response is NOT READY for delivery in a high-risk business decision context
• **Biggest Risk: Causal Hallucination**
  - The claim "consolidation and competition are primary drivers" is presented as fact but is purely inferred
  - This risk is CRITICAL because it could influence strategy decisions (e.g., "we should consolidate" or "we should differentiate")
  - The passive-voice construction ("are cited") obscures the lack of actual sources
  - Mitigation: Remove causal claims entirely OR qualify them as "possible factors requiring further research"
• **Reliability Score: 52% (UNRELIABLE)**
  - Range of per-claim confidence: 27% to 85% (HIGH VOLATILITY)
  - Source attribution: 0% specificity (no named sources, publications, or dates)
  - Methodology transparency: 0% (no explanation of how benchmarks were aggregated or calculated)
  - Scope alignment: 80% (mostly stays on topic but adds unsolicited analysis)
  - Overall reliability: 52% → Recommend blocking or routing to expert review
• **Improvement Strategy (4-Point Plan):**
  - **Phase 1 - Immediate (Block & Redirect):** Don't deploy this response as-is; require a human expert to review before stakeholder sharing
  - **Phase 2 - Short-term (Validation Gates):** Implement a pre-delivery confidence check; block responses <70% confidence; add uncertainty disclaimers for the 70-85% range
  - **Phase 3 - Medium-term (Source Quality):** Require named sources (Gartner, Forrester, SaaS Capital Index, etc.) with publication dates; ban vague source attribution ("benchmark reports")
  - **Phase 4 - Long-term (System Evolution):** Retrain the model to penalize unsupported causal claims; implement a peer-validation loop for high-risk queries; build a feedback mechanism to improve hallucination detection

---

## 📋 DEPLOYMENT READINESS CHECKLIST

• **Confidence Threshold Met (95%+)?** ❌ NO—Currently 51%
• **All Claims Sourced?** ❌ NO—Only 60% of claims traceable to sources
• **Causal Claims Supported?** ❌ NO—Primary causal claim is unsupported
• **Data Freshness Documented?** ❌ PARTIAL—Time period implied but not explicit
• **Methodology Transparent?** ❌ NO—Aggregation methods not disclosed
• **Risk Assessment Complete?** ✅ YES—High-risk use case identified
• **Expert Review Required?** ✅ YES—Before any stakeholder delivery
• **Suitable for Production?** ❌ NO—Fails accuracy and reliability thresholds

---

**FINAL VERDICT:** This sample output demonstrates how a hallucination detection system catches enterprise-grade failures before they reach decision-makers. The system identified 5 distinct hallucination patterns, exposed a 51% confidence gap, and blocked delivery based on risk level. This is how **trust frameworks prevent costly mistakes.**
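The confidence aggregation and decision gate described in the example output can be sketched in Python. One caveat from the example itself: section 5 reports an aggregate of 51%, which is the mean of the five per-claim scores, while the section 9 pipeline spec calls for the median, which gives 52 for the same inputs. The sketch below follows the median rule; all names are illustrative, not part of the prompt.

```python
from statistics import median

# Per-claim overall confidence scores from the sample test (section 5).
claim_confidences = {
    "8-12% MRR growth is current average": 72,
    "2023 benchmarks were 12-15%": 62,
    "Consolidation and competition are primary drivers": 27,
    "Top quartile maintains 15-18% MRR growth": 52,
    "Bottom quartile shows 3-5% growth": 42,
}

def decision_gate(confidences: dict) -> dict:
    """Aggregate per-claim confidence (median, per the pipeline spec)
    and apply the four delivery thresholds from the decision gate."""
    aggregate = median(confidences.values())
    if aggregate >= 85:
        status = "deliver"                  # with confidence label
    elif aggregate >= 70:
        status = "deliver_with_disclaimer"  # recommend verification
    elif aggregate >= 50:
        status = "suppress"                 # prompt for clarification / escalate
    else:
        status = "block"                    # log failed case, notify admin
    return {"aggregate_confidence": aggregate, "status": status}

result = decision_gate(claim_confidences)
# median of {72, 62, 27, 52, 42} is 52, which falls in the 50-69 band,
# so the sample response is suppressed rather than delivered.
```

Using the median rather than the mean makes the gate less sensitive to a single outlier claim, though as the 51%-vs-52% discrepancy shows, the two can disagree near a threshold boundary.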
🌀 Claude

Hallucination Detection Prevention

AI systems often generate incorrect or unsupported answers (hallucinations). ⚠️ This prompt helps validate answers, ground them in sources, and prevent errors. 🚀 👉 Use it to build trustworthy AI systems. 👍

🎯 What the Buyer Gets

🧠 Answer validation system
📚 Source grounding framework
📊 Confidence scoring engine
🚫 Hallucination detection & filtering
⚙️ Output verification pipeline
🚀 Reliable AI system design
Added 1 day ago