Prompt Details
Model
Chat - (gpt-5.5)
Token size
1,460
Example input
[recipient role]: Junior analyst
[business context]: E-commerce company tracking high value customer purchasing behavior for loyalty program targeting
[paste SQL query here]: SELECT customer_id, COUNT(order_id) AS total_orders, SUM(order_value) AS total_spent, AVG(order_value) AS avg_order_value, MAX(order_date) AS last_order_date FROM orders WHERE order_status = 'completed' AND order_date >= '2026-01-01' GROUP BY customer_id HAVING SUM(order_value) > 10000 ORDER BY total_spent DESC LIMIT 50;
Example output
# π SQL INTELLIGENCE BREAKDOWN REPORT
βββββββββββββββββββββββββββββββββββ
## SECTION 0 β QUERY VALIDATION CHECK
βββββββββββββββββββββββββββββββββββ
### SYNTAX CHECK
**State: PASSED β
**
Reasoning:
* All SQL clauses are in the correct order.
* Aggregation functions are used correctly.
* `GROUP BY` matches the non-aggregated column (`customer_id`).
* `HAVING` is correctly applied after aggregation.
* `ORDER BY` correctly references an alias (`total_spent`).
* `LIMIT` syntax is valid.
---
### LOGIC CHECK
**State: WARNING β οΈ**
Reasoning:
The query is logically correct overall, but there are a few potential concerns:
1. **Date filtering may exclude historical high-value customers**
* The query only considers orders after `'2026-01-01'`.
* Customers who spent heavily before that date will not appear.
2. **Possible currency inconsistency**
* If `order_value` contains multiple currencies or unnormalized values, `SUM(order_value)` may become misleading.
3. **Potential duplicate order counting**
* If `order_id` is not unique in the `orders` table due to joins or data duplication, `COUNT(order_id)` may overcount orders.
---
### COMPLETENESS CHECK
**State: COMPLETE β
**
Reasoning:
The query includes:
* Status filtering
* Time period filtering
* Customer aggregation
* High-value customer qualification
* Ranking and limiting results
These align well with the stated loyalty-program targeting use case.
---
### VALIDATION VERDICT
## β οΈ HAS WARNINGS β runs but may produce unexpected results
### Issues Found & Suggested Corrections
#### Issue 1 β Possible duplicate order counting
**Current line:**
```sql
COUNT(order_id) AS total_orders
```
**Safer version:**
```sql
COUNT(DISTINCT order_id) AS total_orders
```
Reason:
Prevents inflated order counts if duplicates exist.
---
#### Issue 2 β Date filter may unintentionally exclude valuable customers
**Current line:**
```sql
AND order_date >= '2026-01-01'
```
**Possible business-aligned improvement:**
```sql
AND order_date >= CURRENT_DATE - INTERVAL '12 months'
```
Reason:
Creates a rolling loyalty analysis window instead of a fixed calendar cutoff.
---
## Corrected Working Version Used for Remaining Analysis
```sql
SELECT
customer_id,
COUNT(DISTINCT order_id) AS total_orders,
SUM(order_value) AS total_spent,
AVG(order_value) AS avg_order_value,
MAX(order_date) AS last_order_date
FROM orders
WHERE order_status = 'completed'
AND order_date >= '2026-01-01'
GROUP BY customer_id
HAVING SUM(order_value) > 10000
ORDER BY total_spent DESC
LIMIT 50;
```
βββββββββββββββββββββββββββββββββββ
## SECTION 1 β QUERY INTELLIGENCE SNAPSHOT
βββββββββββββββββββββββββββββββββββ
* **Query purpose:** Identify the top high-spending customers eligible for loyalty targeting based on completed purchases in 2026.
* **Complexity rating:** Intermediate
* **Number of tables involved:** 1 table (`orders`)
* **Number of conditions applied:** 3 conditions
1. `order_status = 'completed'`
2. `order_date >= '2026-01-01'`
3. `SUM(order_value) > 10000`
* **Output type:** Aggregated summary ranked list
βββββββββββββββββββββββββββββββββββ
## SECTION 2 β CLAUSE BY CLAUSE BREAKDOWN
βββββββββββββββββββββββββββββββββββ
### SELECT
**Technical function:**
Chooses the columns and calculated metrics returned in the final output.
**Important business detail:**
This section creates customer-level KPIs like spending, order frequency, and purchase recency.
Included calculations:
* `COUNT(DISTINCT order_id)` β total completed orders
* `SUM(order_value)` β total customer spending
* `AVG(order_value)` β average basket size
* `MAX(order_date)` β most recent purchase date
---
### FROM
**Technical function:**
Specifies the source table being queried.
**Important business detail:**
All analysis comes from the `orders` transaction table.
---
### WHERE
**Technical function:**
Filters rows before aggregation happens.
**Important business detail:**
Only completed orders from 2026 onward are included in the analysis.
Conditions:
```sql
order_status = 'completed'
```
Excludes cancelled, refunded, or pending orders.
```sql
order_date >= '2026-01-01'
```
Limits analysis to recent customer activity.
---
### GROUP BY
**Technical function:**
Groups rows by customer so aggregate calculations can be performed.
**Important business detail:**
Instead of analyzing individual orders, the business now analyzes customer purchasing behavior.
```sql
GROUP BY customer_id
```
---
### HAVING
**Technical function:**
Filters aggregated groups after calculations are completed.
**Important business detail:**
Unlike `WHERE`, which filters individual rows, `HAVING` filters summarized customer results.
Condition:
```sql
HAVING SUM(order_value) > 10000
```
Plain English:
βOnly keep customers whose total spending exceeds βΉ10,000.β
Difference from WHERE:
* `WHERE` filters raw transactions
* `HAVING` filters aggregated customer summaries
---
### ORDER BY
**Technical function:**
Sorts final results.
**Important business detail:**
The highest-value customers appear first.
```sql
ORDER BY total_spent DESC
```
---
### LIMIT
**Technical function:**
Restricts output row count.
**Important business detail:**
Returns only the top 50 customers for focused loyalty targeting.
```sql
LIMIT 50
```
βββββββββββββββββββββββββββββββββββ
## SECTION 3 β BUSINESS INTELLIGENCE LAYER
βββββββββββββββββββββββββββββββββββ
### BUSINESS QUESTION ANSWERED
Which customers generated the highest completed purchase value in 2026 and should be prioritized for loyalty or retention campaigns?
---
### WHO USES THIS AND HOW
**Primary user:**
A CRM manager, loyalty program manager, or marketing analyst.
**Decision enabled:**
They would use this output to:
* Launch VIP loyalty campaigns
* Offer premium rewards
* Create personalized retention strategies
* Identify top customer segments
---
### DATA STORY
This query highlights the customers contributing the most revenue through completed purchases. By combining total spend, order frequency, and recent purchase activity, the business can distinguish between loyal repeat buyers and occasional high-ticket shoppers. The output helps the company focus marketing budget and loyalty incentives on customers most likely to generate long-term revenue.
βββββββββββββββββββββββββββββββββββ
## SECTION 4 β QUERY QUALITY SCORECARD
βββββββββββββββββββββββββββββββββββ
| Dimension | Score | Assessment |
| ---------------- | ----- | ------------------------------------------------------------------------------------------------------------- |
| Readability | 9/10 | Clean formatting and intuitive aliases make the query easy to follow. |
| Efficiency | 7/10 | Efficient for moderate datasets, but aggregation scans may become expensive on large tables without indexing. |
| Accuracy Risk | 7/10 | Mostly reliable, but duplicate order IDs and date filtering assumptions may distort results. |
| Business Clarity | 9/10 | Strong alignment between business objective and query logic. |
# OVERALL QUERY SCORE: 32/40
Overall assessment:
A strong intermediate-level business analytics query that effectively identifies high-value customers, with minor improvements needed for scalability and data-quality protection.
βββββββββββββββββββββββββββββββββββ
## SECTION 5 β OPTIMIZATION AND RISK FLAGS
βββββββββββββββββββββββββββββββββββ
### OPTIMIZATION SUGGESTION
#### Recommendation:
Add an index on:
```sql
(order_status, order_date, customer_id)
```
#### Why this helps:
The query filters heavily on:
* `order_status`
* `order_date`
and groups by:
* `customer_id`
A composite index reduces scan time and improves aggregation performance significantly on large e-commerce datasets.
---
### Improved Query Version
```sql
SELECT
customer_id,
COUNT(DISTINCT order_id) AS total_orders,
SUM(order_value) AS total_spent,
AVG(order_value) AS avg_order_value,
MAX(order_date) AS last_order_date
FROM orders
WHERE order_status = 'completed'
AND order_date >= '2026-01-01'
GROUP BY customer_id
HAVING SUM(order_value) > 10000
ORDER BY total_spent DESC
LIMIT 50;
```
---
## DATA RISK FLAGS
### Risk 1
#### What could go wrong
Duplicate rows may inflate customer metrics.
#### When it becomes a problem
If the orders table contains accidental duplicate transactions or partial order records.
#### How to prevent it
Use:
```sql
COUNT(DISTINCT order_id)
```
and enforce primary key constraints on `order_id`.
---
### Risk 2
#### What could go wrong
Refunded or partially refunded orders may still appear as completed revenue.
#### When it becomes a problem
If refund processing does not update `order_status`.
#### How to prevent it
Include refund logic such as:
```sql
AND refund_status = 'not_refunded'
```
or subtract refunded amounts in aggregation logic.
βββββββββββββββββββββββββββββββββββ
## SECTION 6 β PLAIN ENGLISH MASTER SUMMARY
βββββββββββββββββββββββββββββββββββ
This query works like a VIP customer ranking system for an e-commerce company. Imagine the business is reviewing every completed purchase made since the start of 2026 and creating a performance summary for each customer. It calculates how many orders each customer placed, how much they spent in total, their average purchase size, and when they last ordered. Then it removes customers who spent less than βΉ10,000 and ranks the remaining customers from highest spender to lowest spender. Finally, it shows only the top 50 customers so the loyalty team can focus on the most valuable buyers first. A business team should use this result to target premium customers with retention offers, rewards, and personalized engagement campaigns.
## COMPREHENSION CONFIDENCE SCORE: 9/10
**Reasoning:**
The query is structurally strong, business-aligned, and relatively easy to interpret despite using aggregation and post-group filtering. Minor warnings exist around data quality assumptions, but overall the logic is clear and reliable for analytical use.
By purchasing this prompt, you agree to our terms of service
GPT-5.5
Paste any SQL query and get a complete 7-section intelligence breakdown β automatic syntax and logic validation, clause by clause explanation, business intelligence layer, quality scorecard, optimization suggestion, data risk flags and plain English summary. Auto-detects broken queries and suggests fixes before explaining. Works for any skill level and any database type.
...more
Added 1 week ago
