Prompt Details
Model
(claude-4-6-sonnet)
Token size
2,843
Example input
[TECH_STACK]: Python / dbt / Great Expectations / Snowflake / Airflow
[ANALYSIS_SCOPE]: staging layer + marts/customer_revenue models + warehouse load validation
[INPUT_AVAILABLE]: tree
[QUALITY_PRIORITY]: completeness | validity | uniqueness
Example output
I'll work through the operating mode detection and setup before any analysis.
---
```
OPERATING MODE: 1 — Tree-First
Justification: INPUT_AVAILABLE = tree. No pipeline files, dbt model SQL,
Great Expectations suites, or test configs are in context. All coverage
findings below are {Speculative: tree-only} until Phase 2 files are supplied.
QUALITY DIMENSIONS IN SCOPE: Completeness | Validity | Uniqueness
(Accuracy and Timeliness excluded per QUALITY_PRIORITY; flagged if
structural signals make gaps visible)
FILES ASSESSED (Direct): NONE
FILES IN TREE ONLY (Speculative): awaiting your tree output
FILES NOT AVAILABLE: all pipeline files, dbt models, GE suites, sources.yml,
schema.yml test blocks, Airflow DAG definitions
```
---
## ⚠️ Tree Not Yet Received
You've declared `[INPUT_AVAILABLE] = tree` but no tree output is present in your message. I cannot speculate on folder topology, naming conventions, or coverage gaps without it.
**Do not proceed to analysis until the tree is provided.** Fabricating structure would violate the strict constraints of this framework.
---
## What to Paste
Run one of the following in your project root and paste the full output:
**Option A — Standard Unix tree**
```bash
tree -a -I '__pycache__|*.pyc|.git|.venv|node_modules' \
--dirsfirst -L 5
```
**Option B — find fallback (if tree is not installed)**
```bash
find . -type f \
\( -name "*.py" -o -name "*.sql" -o -name "*.yml" \
-o -name "*.yaml" -o -name "*.json" \) \
| grep -v -E '(__pycache__|\.git|\.venv)' \
| sort
```
---
## What I Will Do With It (Phase 1 Preview)
Once the tree arrives, I will:
1. **Map your staging + marts layer topology** — identify extract, transform, and load steps by folder and file naming conventions
2. **Locate test coverage signals** — `schema.yml`, `tests/` directories, GE checkpoint YAMLs, `expectations/` folders, `sources.yml` freshness blocks
3. **Flag unguarded high-risk nodes** — models with likely Fan-Out > 3 and no visible test file sibling
4. **Identify the 5–10 highest-priority files** to request for Phase 2 Direct analysis, ranked by:
- Steps feeding downstream mart models (high blast radius)
- Staging entry points with no adjacent `schema.yml`
- Any load-to-Snowflake step with no visible assertion layer
5. **Produce a speculative coverage map** clearly marked `{Speculative: tree-only}` with `{Confidence: LOW | MEDIUM}` on every inference
---
Paste your tree output and Phase 1 analysis begins immediately.
By purchasing this prompt, you agree to our terms of service
CLAUDE-4-6-SONNET
✅ Maps quality check coverage across every pipeline step — completeness, accuracy, consistency, timeliness, validity, and uniqueness
✅ Identifies silent failures: checks that log a warning and let corrupt data through instead of halting the pipeline
✅ Flags unguarded high-risk steps — steps with high Fan-Out and zero quality checks — before they become incidents
✅ Every remediation includes concrete check implementation code, not just abstract recommendations
✅ Each proposed blocking check
...more
Added 1 week ago
