Prompt Details
Model
(claude-4-6-sonnet)
Token size
1,027
Example input
[PROJECT_CONTEXT]: B2B SaaS platform (Node.js/TypeScript monorepo, 180,000 lines of code, 12 microservices). Engineering team of 8 running Claude Code daily. Monthly AI API spend has tripled in 90 days and the CTO wants a 60% cost reduction without slowing feature velocity.
[TASK_LIST]:
1. Search all service directories for any file importing the deprecated AuthV1 module
2. Implement a new rate-limiting middleware for the Payments microservice
3. Diagnose why the Order Processing service intermittently deadlocks under load (spans 4 services and the message queue)
4. Write unit tests for the new rate-limiting middleware
5. Scan package.json files across all 12 services to list every unique dependency version
6. Refactor the entire data-access layer from raw SQL to a typed ORM (affects 47 files)
Example output
Good. I now have accurate model names and Claude Code CLAUDE.md configuration patterns. The prompt asks for `claude-haiku-4-5`, `claude-sonnet-4-5`, and `claude-opus-4-5`; these are short aliases, and the full versioned model strings are `claude-haiku-4-5-20251001`, `claude-sonnet-4-5-20250929`, and `claude-opus-4-5-20251101`. I'll use the versioned strings in the CLAUDE.md and note the difference.
Here are all three outputs:
---
## OUTPUT 1: TIERING DECISION TABLE
```
TASK                                      | TIER   | REASON
──────────────────────────────────────────────────────────────────────────────
1. Search service dirs for AuthV1 imports | HAIKU  | Pure file-system grep/glob scan; no generation or reasoning required.
2. Implement rate-limiting middleware     | SONNET | Standard feature implementation in a single microservice; well-scoped coding task.
3. Diagnose Order Processing deadlock     | OPUS   | Cross-service root-cause analysis spanning 4 services + message queue; requires deep multi-file reasoning.
4. Write unit tests for rate-limiting MW  | SONNET | Test generation for known, bounded code; no architectural judgment needed.
5. Scan package.json for dependency vers. | HAIKU  | Read-only structured data extraction across files; repetitive summarisation with no synthesis required.
6. Refactor data-access layer (47 files)  | OPUS   | Large-scale cross-file refactor with ORM semantics; high blast radius and architectural consistency required.
```
**Tier distribution:** Haiku × 2 (33%) · Sonnet × 2 (33%) · Opus × 2 (33%)
**Token weight estimate** (tasks are not equal in token cost):
- Tasks 1 & 5 are short-lived scans: assign **~10% of session tokens each** (20% total → Haiku)
- Tasks 2 & 4 are medium implementation loops: **~15% each** (30% total → Sonnet)
- Tasks 3 & 6 are deep, iterative: **~25% each** (50% total → Opus)
These weights drive Output 3.
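The per-tier shares follow from summing the per-task weights listed above. A minimal sketch of that aggregation, assuming the weights are rough estimates rather than measurements (the `TASKS` mapping and `tier_shares` helper are illustrative names, not part of any tool):

```python
from collections import defaultdict

# Tier and estimated share of session tokens per task,
# copied from the tiering table and the token weight estimate.
TASKS = {
    1: ("haiku", 0.10),
    2: ("sonnet", 0.15),
    3: ("opus", 0.25),
    4: ("sonnet", 0.15),
    5: ("haiku", 0.10),
    6: ("opus", 0.25),
}


def tier_shares(tasks):
    """Sum the per-task token weights into one share per tier."""
    shares = defaultdict(float)
    for tier, weight in tasks.values():
        shares[tier] += weight
    return dict(shares)


shares = tier_shares(TASKS)
for tier in ("haiku", "sonnet", "opus"):
    print(f"{tier:<7}{shares[tier]:.0%}")
```

Counting tasks gives an even 33% split per tier, but weighting by tokens puts half the session on Opus, which is what matters for cost.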
---
## OUTPUT 2: CLAUDE.md SNIPPET
```yaml
# ─── Model Tier Configuration ───────────────────────────────────────────────
# Cost-optimised routing for B2B SaaS monorepo (180k LOC, 12 microservices).
# Tiers are assigned by cognitive load, not task prominence.
# Review quarterly as model capabilities evolve.
model_config:
  # HAIKU: file search, glob/grep, directory scans, dependency audits,
  #        read-only context gathering, repetitive summarisation.
  # Use for: any task that does not require code generation or reasoning.
  default_haiku_model: claude-haiku-4-5-20251001
  # SONNET: general coding, feature implementation, test generation,
  #         API integration, documentation. Handles ~90% of daily coding tasks.
  # Use for: well-scoped implementation work in a single service or module.
  default_sonnet_model: claude-sonnet-4-5-20250929
  # OPUS: system architecture, cross-service root-cause debugging,
  #       large-scale refactors (>10 files), security audits, performance analysis.
  # Gate with a comment in your task: "# opus: <reason>" before invoking.
  default_opus_model: claude-opus-4-5-20251101
  subagent_defaults:
    # Search/lookup subagents: Haiku. Fast, cheap, sufficient.
    search: claude-haiku-4-5-20251001
    # Coding/test subagents: Sonnet. Balanced capability-to-cost.
    coding: claude-sonnet-4-5-20250929
    # Architecture/debug subagents: Opus. Reserved for irreducible complexity.
    architecture: claude-opus-4-5-20251101
    complex_debug: claude-opus-4-5-20251101
```
> **Note:** The model strings your prompt specified (`claude-haiku-4-5`, `claude-sonnet-4-5`, `claude-opus-4-5`) are short aliases. The full versioned model IDs currently available are `claude-haiku-4-5-20251001`, `claude-sonnet-4-5-20250929`, and `claude-opus-4-5-20251101`. The snippet above uses the pinned version IDs to prevent unexpected model upgrades during a production rollout.
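The tier criteria in the snippet's comments can be read as a small decision rule. The sketch below is one hedged interpretation: the `Task` fields and `kind` labels are hypothetical, chosen for illustration, and the thresholds (more than 10 files, more than one service) come from the comments above rather than from any Claude Code feature.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    kind: str  # hypothetical labels: "scan", "implement", "test", "diagnose", "refactor"
    files_touched: int = 1
    services_touched: int = 1


def pick_tier(task: Task) -> str:
    """Map a task to a tier using the rules stated in the config comments."""
    # Read-only scans need no generation or reasoning, however many files they read.
    if task.kind == "scan":
        return "haiku"
    # Large refactors (>10 files) or cross-service work go to Opus.
    if task.files_touched > 10 or task.services_touched > 1:
        return "opus"
    # Everything else is well-scoped work in a single service or module.
    return "sonnet"


TASKS = [
    Task("Search for AuthV1 imports", "scan", services_touched=12),
    Task("Implement rate-limiting middleware", "implement"),
    Task("Diagnose Order Processing deadlock", "diagnose", services_touched=4),
    Task("Write unit tests for middleware", "test"),
    Task("Scan package.json versions", "scan", services_touched=12),
    Task("Refactor data-access layer", "refactor", files_touched=47),
]

for task in TASKS:
    print(f"{pick_tier(task):<7}{task.name}")
```

In practice the hard part is labelling a task's `kind` and scope up front; the rule itself is deliberately simple so that engineers can apply it by hand.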
---
## OUTPUT 3: COST-SAVINGS CALCULATOR
**Token distribution from tiering table:**
```
Tier     Share   Rate
──────────────────────────
Haiku    20%     $0.50/M
Sonnet   30%     $3.00/M
Opus     50%     $15.00/M
```
**Blended tiered rate:**
`(0.20 × $0.50) + (0.30 × $3.00) + (0.50 × $15.00)`
`= $0.10 + $0.90 + $7.50 = $8.50 per million tokens`
**All-Sonnet baseline rate:** `$3.00 per million tokens`
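The blended rate is a share-weighted average of the per-tier rates. A minimal sketch, assuming the per-million-token rates and token shares quoted in this output (the function and constant names are illustrative):

```python
# Per-million-token rates and token shares as quoted in this cost calculator.
RATES_PER_MILLION = {"haiku": 0.50, "sonnet": 3.00, "opus": 15.00}
SHARES = {"haiku": 0.20, "sonnet": 0.30, "opus": 0.50}


def blended_rate(shares, rates):
    """Return the share-weighted average rate in dollars per million tokens."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("token shares must sum to 1")
    return sum(shares[tier] * rates[tier] for tier in shares)


rate = blended_rate(SHARES, RATES_PER_MILLION)
print(f"${rate:.2f} per million tokens")
```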
```
──────────────────────────────────────────────────────────────────────────────
Scenario       Tokens     All-Sonnet   Tiered     Savings     Reduction
──────────────────────────────────────────────────────────────────────────────
A: Light       50,000     $0.1500      $0.4250    -$0.2750    -183%
B: Standard    100,000    $0.3000      $0.8500    -$0.5500    -183%
C: Heavy       200,000    $0.6000      $1.7000    -$1.1000    -183%
──────────────────────────────────────────────────────────────────────────────
```
**Important result: tiered cost is HIGHER than all-Sonnet here, not lower.**
This is because Task 3 (deadlock diagnosis) and Task 6 (47-file ORM refactor) are assigned to Opus, which at **$15.00/M is 5× the cost of Sonnet**. With 50% of tokens routed to Opus, the blended rate ($8.50/M) exceeds the Sonnet baseline ($3.00/M).
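The scenario rows can be reproduced from the two rates alone. The sketch below assumes the $3.00/M baseline and $8.50/M blended rate stated above; the helper names are illustrative.

```python
def session_cost(tokens, rate_per_million):
    """Dollar cost of a session at a flat per-million-token rate."""
    return tokens * rate_per_million / 1_000_000


def compare(tokens, baseline_rate=3.00, tiered_rate=8.50):
    """Return (baseline, tiered, savings, reduction) for one session size."""
    baseline = session_cost(tokens, baseline_rate)
    tiered = session_cost(tokens, tiered_rate)
    savings = baseline - tiered  # negative when tiering costs more
    return baseline, tiered, savings, savings / baseline


for label, tokens in [("A", 50_000), ("B", 100_000), ("C", 200_000)]:
    baseline, tiered, savings, reduction = compare(tokens)
    print(f"{label} {tokens:>8,} {baseline:8.4f} {tiered:8.4f} {savings:+9.4f} {reduction:+.0%}")
```

Because both costs scale linearly with tokens, the percentage is the same in every scenario; only the share mix or the rates can change it.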
**To actually hit the CTO's 60% reduction target, the team needs to:**
```
──────────────────────────────────────────────────────────────────────────────
Lever                                    Impact on blended rate
──────────────────────────────────────────────────────────────────────────────
1. Use opusplan alias (Opus plan,        Sonnet handles ~70% of Opus tokens
   Sonnet execute) for tasks 3 & 6       → blended drops to ~$4.35/M
2. Shift Haiku tasks from Sonnet         Current: some engineers run all-Sonnet
   (Tasks 1 & 5 were Sonnet before)      → saves $0.50 vs $3.00 on 20% of tokens
3. Cap Opus sessions with /effort high   Reduces Opus token spend 30–40%
   instead of default xhigh per task
──────────────────────────────────────────────────────────────────────────────
Combined realistic blended rate: ~$1.40–$2.60/M (13–53% below the $3.00/M Sonnet baseline)
```
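The effect of lever 1 can be checked by moving part of the Opus share onto Sonnet and recomputing the blend. This is a rough sketch assuming the rates and shares quoted earlier and the stated "~70% of Opus tokens" figure; under those assumptions the arithmetic gives about $4.30/M, close to the approximate figure in the table and still above the $3.00/M baseline on its own.

```python
# Per-million-token rates and baseline token shares, as quoted in the cost calculator.
RATES = {"haiku": 0.50, "sonnet": 3.00, "opus": 15.00}
BASE_SHARES = {"haiku": 0.20, "sonnet": 0.30, "opus": 0.50}


def blended_rate(shares):
    """Share-weighted average rate in dollars per million tokens."""
    return sum(shares[tier] * RATES[tier] for tier in shares)


def shift_opus_to_sonnet(shares, fraction):
    """Move `fraction` of the Opus token share onto Sonnet (lever 1)."""
    moved = shares["opus"] * fraction
    shifted = dict(shares)
    shifted["opus"] -= moved
    shifted["sonnet"] += moved
    return shifted


after_lever_1 = shift_opus_to_sonnet(BASE_SHARES, 0.70)
print(f"before lever 1: ${blended_rate(BASE_SHARES):.2f}/M")
print(f"after lever 1:  ${blended_rate(after_lever_1):.2f}/M")
```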
**Revised savings with all three levers applied (conservative estimate):**
```
──────────────────────────────────────────────────────────────────────────────
Scenario       Tokens     All-Sonnet   Optimised   Savings    Reduction
──────────────────────────────────────────────────────────────────────────────
A: Light       50,000     $0.1500      $0.0900     $0.0600    40%
B: Standard    100,000    $0.3000      $0.1800     $0.1200    40%
C: Heavy       200,000    $0.6000      $0.3600     $0.2400    40%
──────────────────────────────────────────────────────────────────────────────
```
Reaching 60% requires also auditing whether tasks 3 & 6 can be decomposed, e.g., using Haiku to gather context files first, then handing a compressed summary to Opus for the reasoning step only. That decomposition alone can cut Opus token consumption by 40–60% on long-context tasks.
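A rough way to size the decomposition is to move the cut portion of Opus tokens onto Haiku and re-price the session. The sketch below assumes the rates and shares quoted earlier, and makes one simplifying assumption of its own: every token removed from Opus is spent on Haiku context gathering instead. The function name is illustrative.

```python
# Per-million-token rates and token shares, as quoted in the cost calculator.
RATES = {"haiku": 0.50, "sonnet": 3.00, "opus": 15.00}
SHARES = {"haiku": 0.20, "sonnet": 0.30, "opus": 0.50}


def decomposed_cost(tokens, opus_cut):
    """Session cost in dollars when `opus_cut` of the Opus tokens move to Haiku.

    Assumption: the context-gathering work removed from Opus costs the same
    number of tokens on Haiku.
    """
    moved = SHARES["opus"] * opus_cut
    shares = {
        "haiku": SHARES["haiku"] + moved,
        "sonnet": SHARES["sonnet"],
        "opus": SHARES["opus"] - moved,
    }
    return sum(tokens * shares[tier] * RATES[tier] for tier in shares) / 1_000_000


tokens = 100_000
all_sonnet = tokens * RATES["sonnet"] / 1_000_000
for cut in (0.0, 0.4, 0.6):
    print(f"Opus cut {cut:.0%}: ${decomposed_cost(tokens, cut):.4f} vs all-Sonnet ${all_sonnet:.4f}")
```

Under these assumptions a 40–60% cut still leaves a 100,000-token session above the all-Sonnet cost, which suggests the decomposition has to be combined with the other levers rather than relied on alone.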
Route Claude Code tasks to the right model tier and cut AI costs 5× without quality loss.
- Haiku for file search, grep & simple lookups
- Sonnet for coding tasks & feature builds
- Opus for architecture, hard bugs & refactors
- Generates a ready-to-paste CLAUDE.md snippet
- Cost calculator for 50K–200K token sessions
Added 2 weeks ago
