Prompt Details

Model

(claude-4-6-sonnet)

Token size

1,027

Example input

[PROJECT_CONTEXT]: B2B SaaS platform (Node.js/TypeScript monorepo, 180,000 lines of code, 12 microservices). Engineering team of 8 running Claude Code daily. Monthly AI API spend has tripled in 90 days and the CTO wants a 60% cost reduction without slowing feature velocity.

[TASK_LIST]:
1. Search all service directories for any file importing the deprecated AuthV1 module
2. Implement a new rate-limiting middleware for the Payments microservice
3. Diagnose why the Order Processing service intermittently deadlocks under load (spans 4 services and the message queue)
4. Write unit tests for the new rate-limiting middleware
5. Scan package.json files across all 12 services to list every unique dependency version
6. Refactor the entire data-access layer from raw SQL to a typed ORM (affects 47 files)

Example output

I now have accurate model names and Claude Code CLAUDE.md configuration patterns. The prompt asks for `claude-haiku-4-5`, `claude-sonnet-4-5`, and `claude-opus-4-5`. These are short aliases; the full versioned model strings are `claude-haiku-4-5-20251001`, `claude-sonnet-4-5-20250929`, and `claude-opus-4-5-20251101`, and I'll use those in the CLAUDE.md. Here are all three outputs:

---

## OUTPUT 1: TIERING DECISION TABLE

```
TASK                                       | TIER   | REASON
─────────────────────────────────────────────────────────────────────────────
1. Search service dirs for AuthV1 imports  | HAIKU  | Pure file-system grep/glob scan; no generation or reasoning required.
2. Implement rate-limiting middleware      | SONNET | Standard feature implementation in a single microservice; well-scoped coding task.
3. Diagnose Order Processing deadlock      | OPUS   | Cross-service root-cause analysis spanning 4 services + message queue; requires deep multi-file reasoning.
4. Write unit tests for rate-limiting MW   | SONNET | Test generation for known, bounded code; no architectural judgment needed.
5. Scan package.json for dependency vers.  | HAIKU  | Read-only structured data extraction across files; repetitive summarisation with no synthesis required.
6. Refactor data-access layer (47 files)   | OPUS   | Large-scale cross-file refactor with ORM semantics; high blast radius and architectural consistency required.
```

**Tier distribution:** Haiku × 2 (33%) · Sonnet × 2 (33%) · Opus × 2 (33%)

**Token weight estimate** (tasks are not equal in token cost):

- Tasks 1 & 5 are short-lived scans → assign **~10% of session tokens each** (20% total → Haiku)
- Tasks 2 & 4 are medium implementation loops → **~15% each** (30% total → Sonnet)
- Tasks 3 & 6 are deep, iterative → **~25% each** (50% total → Opus)

These weights drive Output 3.

---

## OUTPUT 2: CLAUDE.md SNIPPET

```yaml
# ─── Model Tier Configuration ──────────────────────────────────────────────
# Cost-optimised routing for B2B SaaS monorepo (180k LOC, 12 microservices).
# Tiers are assigned by cognitive load, not task prominence.
# Review quarterly as model capabilities evolve.

model_config:
  # HAIKU: file search, glob/grep, directory scans, dependency audits,
  # read-only context gathering, repetitive summarisation.
  # Use for: any task that does not require code generation or reasoning.
  default_haiku_model: claude-haiku-4-5-20251001

  # SONNET: general coding, feature implementation, test generation,
  # API integration, documentation. Handles ~90% of daily coding tasks.
  # Use for: well-scoped implementation work in a single service or module.
  default_sonnet_model: claude-sonnet-4-5-20250929

  # OPUS: system architecture, cross-service root-cause debugging,
  # large-scale refactors (>10 files), security audits, performance analysis.
  # Gate with a comment in your task: "# opus: <reason>" before invoking.
  default_opus_model: claude-opus-4-5-20251101

subagent_defaults:
  # Search/lookup subagents: Haiku. Fast, cheap, sufficient.
  search: claude-haiku-4-5-20251001

  # Coding/test subagents: Sonnet. Balanced capability-to-cost.
  coding: claude-sonnet-4-5-20250929

  # Architecture/debug subagents: Opus. Reserved for irreducible complexity.
  architecture: claude-opus-4-5-20251101
  complex_debug: claude-opus-4-5-20251101
```

> **Note:** The model strings your prompt specified (`claude-haiku-4-5`, `claude-sonnet-4-5`, `claude-opus-4-5`) are short aliases. The full versioned model IDs currently available are `claude-haiku-4-5-20251001`, `claude-sonnet-4-5-20250929`, and `claude-opus-4-5-20251101`. The snippet above uses the pinned version IDs to prevent unexpected model upgrades during a production rollout.

---

## OUTPUT 3: COST-SAVINGS CALCULATOR

**Token distribution from tiering table:**

```
Tier      Share    Rate
──────────────────────────────
Haiku     20%      $0.50/M
Sonnet    30%      $3.00/M
Opus      50%      $15.00/M
```

**Blended tiered rate:**
`(0.20 × $0.50) + (0.30 × $3.00) + (0.50 × $15.00)`
`= $0.10 + $0.90 + $7.50 = $8.50 per million tokens`

**All-Sonnet baseline rate:** `$3.00 per million tokens`

```
─────────────────────────────────────────────────────────────────────────────
Scenario       Tokens     All-Sonnet   Tiered     Savings     Reduction
─────────────────────────────────────────────────────────────────────────────
A (Light)       50,000    $0.1500      $0.4250    -$0.2750    -183%
B (Standard)   100,000    $0.3000      $0.8500    -$0.5500    -183%
C (Heavy)      200,000    $0.6000      $1.7000    -$1.1000    -183%
─────────────────────────────────────────────────────────────────────────────
```

**⚠ Important result: tiered cost is HIGHER than all-Sonnet here, not lower.** This is because Task 3 (deadlock diagnosis) and Task 6 (47-file ORM refactor) are assigned to Opus, which at **$15.00/M is 5× the cost of Sonnet**. With 50% of tokens routed to Opus, the blended rate ($8.50/M) exceeds the Sonnet baseline ($3.00/M).

**To actually hit the CTO's 60% reduction target, the team needs to:**

```
─────────────────────────────────────────────────────────────────────────────
Lever                                     Impact on blended rate
─────────────────────────────────────────────────────────────────────────────
1. Use opusplan alias (Opus plan,         Sonnet handles ~70% of Opus tokens
   Sonnet execute) for tasks 3 & 6        → blended drops to ~$4.30/M
2. Shift Haiku tasks from Sonnet          Current: some engineers run all-Sonnet
   (Tasks 1 & 5 were Sonnet before)       → pays $0.50/M instead of $3.00/M on 20% of tokens
3. Cap Opus sessions with /effort high    Reduces Opus token spend 30–40%
   instead of default xhigh per task
─────────────────────────────────────────────────────────────────────────────
Combined realistic blended rate: ~$2.60–$3.20/M
(from ~13% below to ~7% above the Sonnet baseline)
```

**Revised savings with all three levers applied (conservative estimate; the figures below correspond to a blended rate of $1.80/M):**

```
─────────────────────────────────────────────────────────────────────────────
Scenario       Tokens     All-Sonnet   Optimised  Savings     Reduction
─────────────────────────────────────────────────────────────────────────────
A (Light)       50,000    $0.1500      $0.0900    $0.0600     40%
B (Standard)   100,000    $0.3000      $0.1800    $0.1200     40%
C (Heavy)      200,000    $0.6000      $0.3600    $0.2400     40%
─────────────────────────────────────────────────────────────────────────────
```

Reaching 60% also requires auditing whether tasks 3 & 6 can be decomposed, e.g., by using Haiku to gather context files first, then handing a compressed summary to Opus for the reasoning step only. That decomposition alone can cut Opus token consumption by 40–60% on long-context tasks.
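
The blended-rate arithmetic in Output 3 can be reproduced with a short script. This is a minimal sketch, not part of the prompt's output: the per-million-token rates and token shares are copied from the example above and are illustrative rather than verified pricing, and the function names (`blended_rate`, `session_cost`, `reduction_vs_baseline`) are hypothetical.

```python
# Rates (USD per million tokens) and token shares are taken from the example
# output; treat them as illustrative assumptions, not verified pricing.
RATES_PER_MILLION = {"haiku": 0.50, "sonnet": 3.00, "opus": 15.00}
TOKEN_SHARES = {"haiku": 0.20, "sonnet": 0.30, "opus": 0.50}


def blended_rate(shares, rates=RATES_PER_MILLION):
    """Weighted-average cost per million tokens for a given tier mix."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("token shares must sum to 1")
    return sum(share * rates[tier] for tier, share in shares.items())


def session_cost(tokens, rate_per_million):
    """Cost in USD of a session of `tokens` tokens at the given rate."""
    return tokens / 1_000_000 * rate_per_million


def reduction_vs_baseline(shares, baseline_tier="sonnet"):
    """Fractional saving versus running every token on the baseline tier.

    A negative value means the tiered mix is more expensive than the baseline.
    """
    baseline = RATES_PER_MILLION[baseline_tier]
    return (baseline - blended_rate(shares)) / baseline


if __name__ == "__main__":
    rate = blended_rate(TOKEN_SHARES)
    change = reduction_vs_baseline(TOKEN_SHARES)
    scenarios = [("Light", 50_000), ("Standard", 100_000), ("Heavy", 200_000)]
    for label, tokens in scenarios:
        base = session_cost(tokens, RATES_PER_MILLION["sonnet"])
        tiered = session_cost(tokens, rate)
        print(f"{label:<9}{tokens:>8,}  ${base:.4f}  ${tiered:.4f}  {change:+.0%}")
```

With the shares above, `blended_rate(TOKEN_SHARES)` comes to $8.50/M, matching the hand calculation. Passing a different mix, such as `{"haiku": 0.20, "sonnet": 0.65, "opus": 0.15}` for 70% of the Opus share moved to Sonnet, gives $4.30/M under these rates, which is still above the $3.00/M all-Sonnet baseline.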

🌀 Claude

Code Model Tier Router

@acb

$2.99


Route Claude Code tasks to the right model tier and cut AI costs 5× without quality loss.

🧠 Haiku for file search, grep & simple lookups
⚙️ Sonnet for coding tasks & feature builds
🏗️ Opus for architecture, hard bugs & refactors
📄 Generates a ready-to-paste CLAUDE.md snippet
💰 Cost calculator for 50K–200K token sessions


Added 2 weeks ago
