Skill Details
Runtime
(claude-4-6-sonnet)
Example user message
[userMessage]: QUERY_TEXT: SELECT * FROM customers c WHERE status = 'active' AND (SELECT SUM(total) FROM orders o WHERE o.customer_id = c.id) > 5000 AND email LIKE '%@yahoo.com'; SCHEMA_CONTEXT: customers (id INT PK, email VARCHAR, name VARCHAR, status VARCHAR); orders (id INT PK, customer_id INT, total DECIMAL). Indexes on customers(email), customers(status), orders(customer_id). DATABASE_ENGINE: PostgreSQL PERFORMANCE_COMPLAINT: High CPU usage and slow response time on the customer loyalty dashboard.
Example agent response
## SQL Query Self-Optimizer™ — v1.0.0
**Engine:** PostgreSQL | **Loop:** GEPA Evolution — 3-pass ceiling
---
## PHASE 1: DATA INPUT
**Intake Gate — Status:**
| Field | Status |
|---|---|
| QUERY_TEXT | ✓ Present |
| SCHEMA_CONTEXT | ✓ Present (tables + indexes confirmed) |
| DATABASE_ENGINE | ✓ PostgreSQL |
| PERFORMANCE_COMPLAINT | ✓ High CPU, slow dashboard response |
| EXECUTION_PLAN | ✗ Absent — analysis proceeds without plan data |
**Route:** FULL ANALYSIS — all 3 passes active.
**LOOP PRIMITIVES:**
```
EDITABLE_ASSET → query_config: the supplied QUERY_TEXT
SCALAR_METRIC → severity_density: HIGH + CRITICAL count per 100 lines
TIME_BOX → 3-pass ceiling — emit after Pass 3 regardless
```
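The `SCALAR_METRIC` definition above is a simple ratio. A minimal sketch of it follows; the function name and the sample counts are illustrative assumptions, not part of the skill itself.

```python
def severity_density(high: int, critical: int, line_count: int) -> float:
    """HIGH + CRITICAL findings per 100 lines of query text."""
    if line_count <= 0:
        raise ValueError("line_count must be positive")
    return (high + critical) * 100 / line_count


# Illustrative counts (assumed): 2 HIGH and 1 CRITICAL finding in a 40-line query.
example_density = severity_density(high=2, critical=1, line_count=40)
print(example_density)  # 7.5
```

Because the denominator is the line count, very short queries produce large densities: the same three findings on a single line give 300 per 100 lines.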
---
## PHASE 2: PASS 1 — DISCOVERY
Pre-emit gate applied to each finding: every finding below is anchored to a specific clause or token in QUERY_TEXT.
Running all 10 scans in order:
**SCAN-01 — SELECT \*:** `SELECT *` present in outer query on `customers`. Table has 4 columns, which is not a wide-column case, so AP-MEDIUM. However, query is on a dashboard hot path (per PERFORMANCE_COMPLAINT), so **AP-HIGH** applies per the scan rule exception. → **FINDING #1**
**SCAN-02 — Correlated subquery:** `(SELECT SUM(total) FROM orders o WHERE o.customer_id = c.id)` resolves per-row against every row that passes the `status = 'active'` filter. This is a scalar correlated subquery executing O(n) times. `orders` table size not confirmed in schema, so no small-table downgrade. → **FINDING #2 — AP-HIGH**
**SCAN-03 — Leading-wildcard LIKE:** `email LIKE '%@yahoo.com'` is a leading-wildcard pattern on `customers(email)`, which is a confirmed indexed column. This forces an index bypass. → **FINDING #3 — AP-HIGH**
**SCAN-04 — Function on indexed column:** No `func(col)` predicate present. ✓ Clear.
**SCAN-05 — OR on indexed columns:** No OR predicate present. ✓ Clear.
**SCAN-06 — N+1 pattern:** The correlated subquery in the WHERE clause re-executes per outer row — this is the in-query N+1 variant. Classification rule: N+1 inside a single query's subquery = **AP-CRITICAL.** → **FINDING #4 — AP-CRITICAL** *(overlaps root cause with FINDING #2; merged — see self-reflection below)*
**SCAN-07 — Missing covering index:** The WHERE clause filters on `status` and `email`, there is no ORDER BY, and the SELECT emits all columns. The schema confirms separate indexes on `customers(status)` and `customers(email)`, but no composite index on `(status, email)`. A composite index is a candidate for narrowing the rows that reach the correlated subquery. Note that the leading-wildcard LIKE cannot use `email` as a B-tree search key, so any gain over the existing `customers(status)` index should be confirmed with EXPLAIN. → **FINDING #5 — AP-HIGH** [SCHEMA-CONF — columns confirmed]
**SCAN-08 — Implicit type cast in JOIN:** The correlated subquery joins on `o.customer_id = c.id`. Schema: `customers.id INT`, `orders.customer_id INT`. Types match. ✓ Clear.
**SCAN-09 — DISTINCT as JOIN fix:** No DISTINCT present. ✓ Clear.
**SCAN-10 — Unbound OFFSET pagination:** No LIMIT/OFFSET present. ✓ Clear.
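As a rough illustration of how findings can be anchored to a specific token in QUERY_TEXT, three of the scans above can be approximated with regular expressions. This is a simplified sketch under stated assumptions: the `SCANS` table and `run_scans` helper are hypothetical, and plain text matching can be fooled by string literals or comments, so it is not a substitute for a SQL parser.

```python
import re

# Hypothetical, simplified detectors for SCAN-01, SCAN-03 and SCAN-10.
SCANS = {
    "SCAN-01": re.compile(r"\bSELECT\s+\*", re.IGNORECASE),
    "SCAN-03": re.compile(r"\bLIKE\s+'%", re.IGNORECASE),
    "SCAN-10": re.compile(r"\bOFFSET\s+\d+", re.IGNORECASE),
}


def run_scans(query_text: str) -> dict[str, list[str]]:
    """Return, per scan, the exact substrings of the query that triggered it."""
    return {
        name: [m.group(0) for m in pattern.finditer(query_text)]
        for name, pattern in SCANS.items()
    }


QUERY_TEXT = (
    "SELECT * FROM customers c WHERE status = 'active' "
    "AND (SELECT SUM(total) FROM orders o WHERE o.customer_id = c.id) > 5000 "
    "AND email LIKE '%@yahoo.com';"
)

findings = run_scans(QUERY_TEXT)
for scan, hits in findings.items():
    print(scan, hits if hits else "clear")
```

An empty hit list corresponds to a "✓ Clear" result; a non-empty one carries the matched token that the finding is anchored to.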
---
**Self-reflection loop:**
- Q1: No invented columns — all rewrite targets are schema-confirmed.
- Q2: FINDING #2 and FINDING #4 share the same root cause (correlated subquery). **Merged into FINDING #2/4 — one CRITICAL finding.** The AP-CRITICAL classification wins; AP-HIGH is subsumed.
- Q3: Severity density check — 4 findings (1 CRITICAL, 2 HIGH, 1 MEDIUM equivalent) across a 1-line query. Density is high; no re-examination needed.
- Q4: Every AP-CRITICAL finding (FINDING #2/4) has a concrete rewrite candidate in Pass 2. ✓
---
## PHASE 3: PASS 2 — REWRITE + PASS 3 — VERIFY
**FINDING #1 — SELECT \*:**
Rewrite: Replace with explicit column list `c.id, c.email, c.name, c.status`.
Rewrite certainty: HIGH — mechanical substitution, no logic change.
Behavioral preservation: HIGH. The same four columns are emitted (`id` plus the 3 non-PK columns), so a dashboard consuming all columns is unaffected.
**FINDING #2/4 — Correlated subquery / N+1 CRITICAL:**
Rewrite: Convert to a JOIN against a pre-aggregated subquery (derived table). The subquery runs once, aggregates `SUM(total)` per `customer_id` across all relevant orders, then joins to the outer `customers` filter. This eliminates per-row execution.
Rewrite certainty: HIGH — standard N+1 → derived-table JOIN transformation.
Behavioral preservation: MEDIUM. SUM behavior is equivalent, but customers with zero orders need care. In the original, the correlated subquery returns NULL for them, and `NULL > 5000` is not true, so they are excluded. An INNER JOIN against the aggregation preserves that exclusion because such customers produce no join row. A LEFT JOIN preserves it only if the filter is written as `COALESCE(total_spend, 0) > 5000`. Emitting the INNER JOIN variant as primary; the LEFT JOIN with COALESCE is listed as CONDITIONAL.
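The zero-orders edge case can be checked empirically. The sketch below runs the correlated form, the INNER JOIN rewrite, and the LEFT JOIN + COALESCE variant against toy data. It rests on assumptions: it uses SQLite through Python's `sqlite3` module rather than PostgreSQL, the rows are invented for the test, and the queries select only `c.id` with an added `ORDER BY` so the results compare deterministically.

```python
import sqlite3

# Toy data (hypothetical) covering the edge cases: customer 2 has no orders,
# customer 5 spends exactly 5000, customer 3 is inactive, customer 4 is not on yahoo.com.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, name TEXT, status TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total NUMERIC);
    INSERT INTO customers VALUES
        (1, 'a@yahoo.com', 'Ann', 'active'),
        (2, 'b@yahoo.com', 'Bob', 'active'),
        (3, 'c@yahoo.com', 'Cy',  'inactive'),
        (4, 'd@gmail.com', 'Di',  'active'),
        (5, 'e@yahoo.com', 'Eve', 'active');
    INSERT INTO orders (customer_id, total) VALUES
        (1, 4000), (1, 2000), (3, 9000), (4, 9000), (5, 5000);
""")

FILTERS = "c.status = 'active' AND c.email LIKE '%@yahoo.com'"

QUERIES = {
    "correlated": f"""
        SELECT c.id FROM customers c
        WHERE {FILTERS}
          AND (SELECT SUM(total) FROM orders o WHERE o.customer_id = c.id) > 5000
        ORDER BY c.id""",
    "inner_join": f"""
        SELECT c.id FROM customers c
        INNER JOIN (SELECT customer_id, SUM(total) AS total_spend
                    FROM orders GROUP BY customer_id
                    HAVING SUM(total) > 5000) AS loyal_orders
            ON loyal_orders.customer_id = c.id
        WHERE {FILTERS}
        ORDER BY c.id""",
    "left_join_coalesce": f"""
        SELECT c.id FROM customers c
        LEFT JOIN (SELECT customer_id, SUM(total) AS total_spend
                   FROM orders GROUP BY customer_id) AS spend
            ON spend.customer_id = c.id
        WHERE {FILTERS}
          AND COALESCE(spend.total_spend, 0) > 5000
        ORDER BY c.id""",
}

results = {name: [row[0] for row in conn.execute(sql)] for name, sql in QUERIES.items()}
print(results)  # every variant returns [1]
```

On this data all three forms return only customer 1. A check like this supports, but does not prove, equivalence on production data, and it says nothing about PostgreSQL's execution plan.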
**FINDING #3 — Leading-wildcard LIKE:**
Rewrite: No rewrite of `LIKE '%@yahoo.com'` can use the existing B-tree index on `email` in standard PostgreSQL. Options: (a) keep `LIKE '%@yahoo.com'` and add a `pg_trgm` trigram index (`GIN` on `email`), which is index-assisted but requires the extension; (b) index the expression `reverse(email)` and query with `reverse(email) LIKE 'moc.oohay@%'`, which is index-usable but requires an expression index (created with `text_pattern_ops` unless the database uses the C locale) and a matching change to the predicate. Neither option is a pure SQL rewrite; both require schema changes. Flagging as [CONDITIONAL — SCHEMA-CHANGE-REQUIRED].
Rewrite certainty: MEDIUM — fix is known but requires DBA action.
Behavioral preservation: HIGH — either approach returns identical rows.
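The claim that the reversed-string option returns identical rows rests on a simple identity: a string ends with a suffix exactly when its reversal starts with the reversed suffix. A small sketch of that identity follows; the helper names and sample addresses are assumptions for illustration, and the suffix is assumed to contain no LIKE wildcards (`%` or `_`).

```python
def matches_suffix(email: str, suffix: str) -> bool:
    """Mirrors email LIKE '%<suffix>' for a wildcard-free suffix."""
    return email.endswith(suffix)


def matches_reversed_prefix(email: str, suffix: str) -> bool:
    """Mirrors reverse(email) LIKE '<reversed suffix>%'."""
    return email[::-1].startswith(suffix[::-1])


SUFFIX = "@yahoo.com"
print(SUFFIX[::-1])  # moc.oohay@

# Hypothetical sample addresses, including near-misses.
emails = ["ann@yahoo.com", "bob@yahoo.com.au", "cy@gmail.com", "yahoo.com", "@yahoo.com"]
agree = all(
    matches_suffix(e, SUFFIX) == matches_reversed_prefix(e, SUFFIX) for e in emails
)
print(agree)  # True
```

The prefix form is what makes a B-tree expression index usable: a pattern anchored at the start of the indexed value maps to a contiguous index range, while a pattern anchored at the end does not.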
**FINDING #5 — Missing composite index:**
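A composite index on `(status, email)` lets PostgreSQL seek directly to the `status = 'active'` entries. Because `LIKE '%@yahoo.com'` has a leading wildcard, it cannot narrow the B-tree range on `email`, so any reduction in rows passed to the aggregation JOIN, compared with the existing `customers(status)` index, should be confirmed with EXPLAIN.
Certainty: HIGH for column existence (schema-confirmed). The selectivity of the `status` filter is assumed from the loyalty dashboard context, not measured.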
Behavioral preservation: N/A — index proposal only.
---
## PHASE 4: OUTPUT
---
### OPTIMIZATION SUMMARY
```
Query lines analyzed: 1 (single-statement; clause-level analysis applied)
Anti-patterns detected: 4
CRITICAL: 1
HIGH: 2
MEDIUM: 1
LOW: 0
Rewrites proposed: 3
Rewrites verified: 2 (67% behavioral preservation confirmed)
1 rewrite CONDITIONAL (schema change required — FINDING #3)
Findings deferred [UNCONF]: 0
SEVERITY DENSITY: 300 HIGH+CRITICAL per 100 lines (3 findings on 1 line; elevated)
DATABASE ENGINE: PostgreSQL
```
---
### FINDINGS MATRIX
```
╔══════════════════════════════════════════════╦══════════════════════╦═══════════╦═══════════╗
║ FINDING                                      ║ LOCATION             ║ SEVERITY  ║ SCAN TYPE ║
╠══════════════════════════════════════════════╬══════════════════════╬═══════════╬═══════════╣
║ Correlated subquery executes per outer row   ║ WHERE clause:        ║ CRITICAL  ║ SCAN-06   ║
║ (N+1 in-query pattern; O(n) subquery cost)   ║ (SELECT SUM(total)…) ║           ║ SCAN-02   ║
╠══════════════════════════════════════════════╬══════════════════════╬═══════════╬═══════════╣
║ Leading-wildcard LIKE bypasses email index   ║ WHERE: LIKE          ║ HIGH      ║ SCAN-03   ║
║ Forces full customers scan on indexed col    ║ '%@yahoo.com'        ║           ║           ║
╠══════════════════════════════════════════════╬══════════════════════╬═══════════╬═══════════╣
║ No composite index covering (status, email)  ║ WHERE: status +      ║ HIGH      ║ SCAN-07   ║
║ Two separate indexes; optimizer must choose  ║ email predicates     ║           ║           ║
╠══════════════════════════════════════════════╬══════════════════════╬═══════════╬═══════════╣
║ SELECT * on hot-path dashboard query         ║ SELECT clause        ║ MEDIUM    ║ SCAN-01   ║
║ Prevents index-only scan; transfers all cols ║                      ║           ║           ║
╚══════════════════════════════════════════════╩══════════════════════╩═══════════╩═══════════╝
```
---
### OPTIMIZED QUERY
```sql
-- REWRITE SUMMARY:
--   FINDING #2/4 (CRITICAL): Correlated subquery converted to pre-aggregated
--     derived table JOIN; eliminates per-row subquery execution.
--   FINDING #1 (MEDIUM): SELECT * replaced with explicit column list.
--   FINDING #3 (HIGH): LIKE pattern left structurally unchanged; index-bypass
--     cannot be resolved without schema change (see INDEX PROPOSALS).
--     Comment added to flag for DBA review.
SELECT
    c.id,
    c.email,
    c.name,
    c.status
    -- FINDING #1: Explicit columns replace SELECT *; prevents wide-column
    -- scan and enables future index-only scan if covering index is added.
FROM customers c
INNER JOIN (
    -- FINDING #2/4: Pre-aggregation runs once across all orders,
    -- not once per customer row. INNER JOIN preserves the original
    -- exclusion of customers with no orders (NULL SUM → no join row).
    SELECT
        customer_id,
        SUM(total) AS total_spend
    FROM orders
    GROUP BY customer_id
    HAVING SUM(total) > 5000
) AS loyal_orders
    ON loyal_orders.customer_id = c.id
WHERE
    c.status = 'active'
    -- FINDING #3: Leading-wildcard LIKE; index on customers(email) is
    -- bypassed. Full customers scan occurs here. Mitigation requires
    -- pg_trgm GIN index or reverse-string functional index (see INDEX
    -- PROPOSALS). Pattern preserved unchanged pending DBA decision.
    AND c.email LIKE '%@yahoo.com';
```
---
### INDEX PROPOSALS
```sql
-- FINDING #5: Composite index on the two filtered columns.
-- status first (equality predicate) lets the index seek to active rows.
-- Caveat: the leading-wildcard LIKE cannot use email as a B-tree search
-- key, so compare against the existing customers(status) index with EXPLAIN.
-- Confirmed columns: customers.status, customers.email (schema-confirmed).
CREATE INDEX idx_customers_status_email
ON customers (status, email);
-- FINDING #3 (CONDITIONAL — requires pg_trgm extension):
-- Trigram GIN index enables index-assisted LIKE '%@yahoo.com' scans.
-- Run: CREATE EXTENSION IF NOT EXISTS pg_trgm; first.
-- [CONDITIONAL — DBA decision required before creating]
CREATE INDEX idx_customers_email_trgm
ON customers USING GIN (email gin_trgm_ops);
-- ALTERNATIVE to trigram (no extension required
```
Loop-Driven Anti-Pattern Rewriter
Finds Slow Patterns, Rewrites, Verifies
3-Pass Rewrite Engine with Verdict
Added 11 hours ago
