Prompt Details
Model
(claude-4-5-sonnet)
Token size
914
Example input
[QUERY]: What is our refund policy for digital products?
[RETRIEVED CHUNKS]: Chunk 1: "All physical goods may be returned within 30 days of purchase with original receipt. Refunds are processed within 5-7 business days." Chunk 2: "Digital products include software licenses, downloadable templates, and online courses. These are non-refundable once accessed." Chunk 3: "For enterprise customers, custom refund terms apply as per the signed agreement."
[MODEL RESPONSE]: Our refund policy allows returns within 30 days for all products, including digital ones. Simply contact support with your order number.
[SYSTEM PROMPT]: You are a helpful customer support assistant. Answer the user's question.
Example output
RAG Diagnostic Report
Query: What is our refund policy for digital products?
Overall health: CRITICAL
1. Context Utilisation
FAILED
The model produced an answer that directly contradicts Chunk 2, which states digital products are non-refundable once accessed. The response ("returns within 30 days for all products, including digital ones") mirrors the physical goods policy from Chunk 1 but incorrectly generalises it to digital products. This is a textbook parametric memory override: the model answered from its training data pattern of "30-day return policies" rather than from the retrieved context.
2. Retrieval Quality
GOOD
All three chunks are highly relevant to the query. Chunk 2 contains the exact answer needed. Retrieval is not the problem here.
3. Hallucination Triggers
HIGH
The claim that digital products qualify for 30-day returns is directly refuted by Chunk 2.
🚩 [HALLUCINATION] "returns within 30 days for all products, including digital ones" – directly contradicts Chunk 2, which states digital products are non-refundable once accessed.
🚩 [HALLUCINATION] "Simply contact support with your order number" – no support contact process appears in any retrieved chunk. This detail was generated from training data.
4. Prompt Structure
CRITICAL ISSUE
The system prompt ("You are a helpful customer support assistant. Answer the user's question.") contains no grounding instruction. There is no directive telling the model to use the provided context, restrict its answer to retrieved information, or avoid drawing on general knowledge. Without this, the model treats the context as optional reference material rather than the authoritative source.
5. Chunk Quality
GOOD
Chunks are appropriately sized, clearly delimited, and contain the relevant policy text. Chunk boundaries are clean. No issues here.
Fix Plan – Ranked by Impact
Fix #1 – Add a grounding instruction to the system prompt [Effort: 30 minutes]
Problem: System prompt does not instruct the model to ground its answer in the provided context.
Why it causes failure: The model defaults to parametric memory when no explicit grounding constraint exists. Helpfulness pressure fills the gap.
Exact fix: Replace system prompt with: "You are a customer support assistant. Answer the user's question using ONLY the information in the context sections below. If the context does not contain the answer, say: 'I don't have that information – please contact support directly.' Do not use general knowledge or assumptions."
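The grounding instruction above can be assembled programmatically before each call. A minimal sketch, assuming your pipeline already produces a list of retrieved chunk strings (the function name `build_grounded_prompt` is illustrative, not part of any SDK):

```python
def build_grounded_prompt(chunks):
    """Build the Fix #1 system prompt with retrieved context inlined.

    `chunks` is a list of retrieved text snippets. The wrapper name and
    section labels are hypothetical -- adapt them to your own pipeline.
    """
    context = "\n\n".join(
        f"Context section {i + 1}:\n{chunk}" for i, chunk in enumerate(chunks)
    )
    return (
        "You are a customer support assistant. Answer the user's question "
        "using ONLY the information in the context sections below. If the "
        "context does not contain the answer, say: 'I don't have that "
        "information – please contact support directly.' Do not use general "
        "knowledge or assumptions.\n\n" + context
    )
```

Keeping the constraint and the context in a single string ensures the model never sees the context without the restriction that governs it.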
Fix #2 – Add explicit source boundary markers [Effort: 1 hour]
Problem: Chunks are not labelled as authoritative sources in the prompt structure.
Why it causes failure: Without explicit source labels, the model blends chunk content with training data when the two conflict.
Exact fix: Label each chunk as [SOURCE 1], [SOURCE 2], etc. Add before the query: "The following sources are the only information you may use to answer. Do not reference anything outside these sources."
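The labelling step from Fix #2 is a few lines of string formatting. A minimal sketch, assuming chunks arrive as plain strings (the helper name `label_sources` is hypothetical):

```python
def label_sources(chunks, query):
    """Wrap each retrieved chunk in an explicit [SOURCE n] boundary marker
    and prepend the Fix #2 restriction sentence. Names are illustrative."""
    sources = "\n\n".join(
        f"[SOURCE {i + 1}]\n{chunk}" for i, chunk in enumerate(chunks)
    )
    preamble = (
        "The following sources are the only information you may use to "
        "answer. Do not reference anything outside these sources."
    )
    return f"{preamble}\n\n{sources}\n\n[QUERY]: {query}"
```

Numbered source markers also let you ask the model to cite which [SOURCE n] supports each claim, which makes the faithfulness check in Fix #3 easier to automate.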
Fix #3 – Add a post-generation faithfulness check [Effort: 2–4 hours]
Problem: No verification step exists after generation.
Why it causes failure: Even with a corrected prompt, LLMs occasionally override context on sensitive factual questions involving policies or numbers.
Exact fix: After generation, run a secondary check: "Does this response contradict or go beyond the provided sources? Answer YES or NO." If YES, regenerate with a stronger grounding instruction.
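Fix #3 can be structured as a generate-verify-retry loop. A minimal sketch, assuming `llm` is any callable that maps a prompt string to a completion string; the function and parameter names are placeholders, not a specific SDK:

```python
def generate_with_faithfulness_check(llm, prompt, sources, max_retries=1):
    """Generate a response, ask the model to verify it against the sources,
    and regenerate with a stronger grounding instruction if the check fails.

    `llm` is any callable prompt -> text (e.g. a thin wrapper around your
    model API). All names here are illustrative placeholders.
    """
    response = llm(prompt)
    for _ in range(max_retries):
        verdict = llm(
            "Sources:\n" + sources
            + "\n\nResponse:\n" + response
            + "\n\nDoes this response contradict or go beyond the provided "
              "sources? Answer YES or NO."
        )
        if verdict.strip().upper().startswith("NO"):
            break  # verifier says the response is grounded; accept it
        # Verifier flagged a contradiction: retry with a harder constraint.
        response = llm(
            prompt + "\n\nIMPORTANT: Use ONLY the sources above. If they do "
            "not contain the answer, say you don't have that information."
        )
    return response
```

Each retry costs one extra verification call plus one regeneration, so in practice `max_retries` is kept at 1 or 2.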
This prompt acts as a senior AI engineer who diagnoses your retrieval-augmented generation pipeline. Paste in your user query, retrieved chunks, and the model's final response and get back a structured fault report covering: context utilisation failures, hallucination triggers, retrieval mismatches, prompt injection risks, and a prioritised fix plan with exact prompt rewrites. Works with any RAG setup, from custom pipelines to vector-database-backed stacks. Saves at least 4 hours of manual debugging per session.
