PromptBase

Prompt Details

Model: claude-4-6-sonnet
Token size: 1,118
Example input
[USER_QUERY]: What is the difference between the Pro plan and the Enterprise plan?

[RETRIEVED_CHUNKS]:
Chunk 1: Pro Plan – Includes advanced analytics, priority email support, and up to 10 team members.
Chunk 2: Enterprise Plan – Includes unlimited team members, dedicated account manager, SSO authentication, and custom SLAs.
Chunk 3: Billing Policy – All plans are billed monthly unless annual billing is selected.

[AGENT_RESPONSE]: The Pro plan includes analytics and email support for up to 10 users, while the Enterprise plan offers unlimited users, SSO authentication, and a dedicated account manager. Enterprise plans also include 24/7 phone support.

[EXPECTED_BEHAVIOR]: The answer should compare plan features and avoid mentioning features not present in documentation.
Example output
**Overall: 41/50 — pipeline mostly healthy, one clean hallucination to fix.**

The retrieval layer did its job well. Chunks 1 and 2 directly cover both sides of the comparison, and the model correctly ignored the billing chunk (Chunk 3) as irrelevant noise. The generated response is also well-structured and nearly complete.

The single failure is unambiguous: **"Enterprise plans also include 24/7 phone support"** has no grounding in any retrieved chunk. The model reached into its parametric memory — likely pattern-matching against common enterprise SaaS feature sets — and injected a feature the documentation never mentions. This is exactly the failure mode RAG is supposed to prevent.

**The fix is squarely in the generation layer:**

1. Tighten the system prompt with an explicit grounding constraint — something like *"only state facts present in the retrieved context; if unsupported, say so."*
2. Optionally layer in a post-generation NLI check that scores each response sentence against the context before it's served.

The retrieval and context setup don't need changes. You could also filter Chunk 3 out by raising your similarity threshold, but that's a minor quality-of-life improvement, not a correctness fix.
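The second fix above, a post-generation check that scores each response sentence against the retrieved context, can be sketched in a few lines. This sketch substitutes simple token overlap for a real NLI model as a cheap stand-in; the function name and the 0.5 overlap threshold are illustrative assumptions, not part of the prompt itself:

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercase word/number tokens for a rough lexical comparison."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ungrounded_sentences(answer: str, chunks: list[str],
                         threshold: float = 0.5) -> list[str]:
    """Flag answer sentences with little lexical support in the context.

    A sentence is flagged when fewer than `threshold` of its tokens appear
    anywhere in the retrieved chunks. A production check would replace this
    overlap score with an entailment (NLI) model.
    """
    context = set().union(*(_tokens(c) for c in chunks))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = _tokens(sentence)
        if words and len(words & context) / len(words) < threshold:
            flagged.append(sentence)
    return flagged
```

Run against the example interaction above, this flags only the "24/7 phone support" sentence, matching the audit's finding.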

Search Quality Fixers

Instant access
Usage rights: Commercial use
Money-back guarantee
Retrieval-Augmented Generation (RAG) systems fail in subtle ways: irrelevant chunks, missing context, hallucinated answers, or poorly grounded responses. When something goes wrong, it’s often unclear whether the problem came from retrieval, context selection, or generation. This prompt acts as a structured RAG audit framework that analyzes a full interaction — query → retrieved chunks → generated answer — and evaluates the pipeline across five critical quality dimensions: Retrieval relevance
...more
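A minimal sketch of the interaction record such an audit consumes (query, retrieved chunks, generated answer, expected behavior), assembled into the bracketed-tag format shown in the example input. The class, field names, and helper are illustrative assumptions, not the prompt's actual schema:

```python
from dataclasses import dataclass


@dataclass
class RagInteraction:
    """One query → retrieved chunks → generated answer interaction to audit."""
    query: str                    # the user's question
    retrieved_chunks: list[str]   # context passed to the model
    answer: str                   # what the model generated
    expected_behavior: str = ""   # optional rubric for the audit


def to_prompt_fields(interaction: RagInteraction) -> str:
    """Render the interaction in the bracketed-tag format the example input uses."""
    chunks = " ".join(f"Chunk {i + 1}: {c}"
                      for i, c in enumerate(interaction.retrieved_chunks))
    return (f"[USER_QUERY]: {interaction.query} "
            f"[RETRIEVED_CHUNKS]: {chunks} "
            f"[AGENT_RESPONSE]: {interaction.answer} "
            f"[EXPECTED_BEHAVIOR]: {interaction.expected_behavior}")
```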
Added over 1 month ago