PromptBase
Upgrade
Close icon
General
Home
Marketplace
Create
Hire
Login
Chat
Sell
Explore

Prompt Details

Model
(claude-4-5-sonnet)
Token size
606
Example input
[DATA_MODEL]: Users, Accounts, Tenants, Events, EventAggregates. TenantId on most tables. Events table grows fast. [WORKLOAD:]: 95% reads. Dashboard queries filter by TenantId + date range. Heavy aggregations. Peak concurrency. [CONSTRAINTS]: p95 < 200ms, predictable cost, rapid growth (10x in 12 months). [RISKS:]: Slow dashboard queries, expensive aggregates, index bloat concerns.
Example output
# Database Schema Review ## 1. SCHEMA HEALTH CHECK **Normalization Assessment:** - Events table appears to be an append-only fact table—appropriate for time-series data - EventAggregates suggests precomputation layer—good for read-heavy workload - Tenant/Account/User hierarchy unclear—validate whether Users → Accounts → Tenants or different structure to avoid JOIN proliferation **Primary Key & Constraint Issues:** - Events table: Confirm surrogate key (UUID/BIGINT) with composite unique constraint on (TenantId, EventId, Timestamp) to prevent duplicates - EventAggregates: Missing detail on grain (daily/hourly/per-tenant)—validate uniqueness constraint - TenantId foreign keys: Ensure NOT NULL + indexed on all tables with referential integrity enforced - Timestamp fields: Should be NOT NULL with timezone awareness (TIMESTAMPTZ if Postgres-like) **Critical Gaps:** - No mention of soft-delete patterns—if using `deleted_at`, complicates indexing and queries - EventAggregates refresh strategy unclear—incremental vs. full rebuild impacts locking - JSON/JSONB columns unspecified—if present in Events, risk of unbounded growth and scan overhead **Risk Patterns:** - Events table unbounded growth without partitioning = table bloat + vacuum overhead - Missing archival policy will degrade query performance within 6 months at 10x growth - TenantId cardinality unknown—high cardinality with skewed distribution = index bloat --- ## 2. PERFORMANCE & INDEX STRATEGY **Critical Indexes (Must-Have):** 1. **Events**: `(TenantId, created_at DESC) INCLUDE (event_type, payload_summary)` [Confidence: 9/10] - Covers dashboard date-range filters, enables index-only scans 2. **EventAggregates**: `(TenantId, period_start DESC)` [Confidence: 9/10] - Fastest path for pre-aggregated dashboard data 3. **Tenants/Accounts/Users**: `(id)` primary key + `(tenant_id)` foreign key indexes [Confidence: 10/10] **Conditional Indexes:** 4. **Events**: Partial index `WHERE created_at > NOW() - INTERVAL '90 days'` [Confidence: 7/10] - If queries rarely touch old data, reduces index size 70%+ 5. **Events**: `(TenantId, event_type, created_at)` if filtering by event_type is common [Confidence: 6/10] **Anti-Patterns to Avoid:** - Index on `(created_at, TenantId)` - wrong column order for tenant-scoped queries - Separate indexes on TenantId and created_at—redundant with composite index - Full-table index on Events without partitioning—bloat risk within 12 months **Query Pattern Risks:** - Heavy aggregations (SUM/AVG/COUNT) on Events = sequential scans if not using EventAggregates - Missing LIMIT/pagination on dashboard queries = memory exhaustion risk at scale - N+1 risk if fetching related User/Account data per event row --- ## 3. DATA INTEGRITY & CONSISTENCY **Missing Constraints:** - **Foreign Keys**: TenantId relationships must be enforced unless hard multi-tenancy isolation exists - **Check Constraints**: Date ranges (start_date < end_date), enumerated event_types, non-negative counts - **Unique Constraints**: Events may lack deduplication key (idempotency risk if clients retry) **Transaction Boundary Concerns:** - EventAggregates updates: If computed asynchronously, validate stale-read tolerance (5-10 min lag acceptable?) - Multi-tenant writes: Confirm isolation level prevents cross-tenant data leakage (READ COMMITTED minimum) - Event insertion + aggregate update: Should be eventual consistency, not atomic (performance trade-off) **Consistency Risks:** - Aggregates out-of-sync with Events if background jobs fail—need reconciliation process - Orphan Events if Tenant deleted without cascade—requires soft-delete or cleanup job - Duplicate Events if retry logic lacks idempotency keys (insert with ON CONFLICT) **Recommendations:** - Add `idempotency_key` (UUID) to Events with UNIQUE constraint - Implement aggregate versioning or checksum validation - Use advisory locks or queue-based processing for aggregate updates --- ## 4. SCALABILITY & OPERABILITY **Partitioning Strategy (High Priority):** - **Events**: Range partition by `created_at` (monthly) with TenantId sub-partitioning if supported - Enables DROP PARTITION for archival (faster than DELETE) - Partition pruning accelerates date-range queries 5-10x - Required before 10x growth or maintenance windows become untenable - **EventAggregates**: Consider materialized view or table partitioned by period [Confidence: 8/10] **Scaling Bottlenecks:** - Events table: Single-table inserts will hit write throughput limits (10k+ rows/sec) - Mitigation: Batch inserts, connection pooling, async writes - Index bloat: B-tree indexes on high-churn tables degrade 20-30% without REINDEX - Mitigation: Autovacuum tuning, scheduled REINDEX on low-traffic windows - Lock contention: EventAggregates updates may block concurrent reads - Mitigation: Use CONCURRENTLY for index builds, lock-free reads via replication lag tolerance **Growth Projections:** - 10x data growth = 10x storage + 3-5x index size (40-60% overhead) - Dashboard queries degrade logarithmically without partitioning (200ms → 2s within 18 months) - Connection pool exhaustion risk if per-tenant connections scale linearly **Monitoring Signals:** - **Critical**: p95/p99 query latency by TenantId, index bloat ratio (>30% = reindex), replication lag (>5s = read replica issue) - **Important**: Sequential scans on Events (should be <1%), cache hit ratio (<95% = undersized buffers), table/index size growth rate - **Nice-to-have**: Per-tenant query cost distribution (identify noisy neighbors), autovacuum run frequency **Confidence in Scalability:** 6/10 without partitioning, 8/10 with partitioning + aggregate layer --- ## 5. MIGRATION & EVOLUTION PLAN **High-Risk Changes (Require Downtime or Extended Locks):** 1. Adding NOT NULL constraints to existing columns (requires full table scan + validation) 2. Creating non-CONCURRENT indexes on Events table (exclusive lock blocks reads/writes) 3. Partitioning existing Events table (requires table rebuild or pg_partman-style migration) **Safe Migration Sequence:** 1. **Phase 1 (Zero Downtime):** - Create new indexes CONCURRENTLY - Add nullable columns for future constraints - Deploy EventAggregates table + backfill job (async) 2. **Phase 2 (Validation):** - Run dual-write to old + new schema - Validate data consistency (shadow mode) - Monitor performance impact (<5% overhead acceptable) 3. **Phase 3 (Cutover):** - Add NOT NULL constraints with table lock (schedule during low-traffic window) - Drop legacy indexes (instant, no lock) - Enable partitioning on Events (requires rewrite—use pg_partman or app-level partitioning) **Rollback Strategies:** - New indexes: DROP CONCURRENTLY (instant) - Schema changes: Maintain backward-compatible views for 2 release cycles - Partitioning: Keep monolithic table until partition pruning validated in production **Backwards Compatibility:** - Use database views to abstract schema changes from application code - Feature flags for new query paths (gradual rollout) - Keep old indexes until new ones proven stable (1-2 weeks) --- ## 6. PRIORITIZED FIX LIST ### **MUST-FIX (P0 - Next Sprint)** 1. **Add composite index on Events (TenantId, created_at DESC)** [Impact: High, Effort: Low] - Unblocks 95% of dashboard queries, reduces p95 from ~2s → <200ms - Trade-off: 15-20% storage overhead, negligible write impact 2. **Implement partitioning on Events table (monthly)** [Impact: Critical, Effort: High] - Prevents 12-month performance cliff, enables cheap archival - Trade-off: 1-2 week migration effort, requires app-side partition awareness 3. **Add idempotency_key UNIQUE constraint to Events** [Impact: High, Effort: Low] - Prevents duplicate event insertions during retries - Trade-off: Slightly increased insert latency (1-2ms), storage for UUID column ### **HIGH PRIORITY (P1 - Next Quarter)** 4. **Optimize EventAggregates refresh strategy** [Impact: High, Effort: Medium] - Switch from full rebuild to incremental updates (10x faster) - Trade-off: Complexity in handling late-arriving events 5. **Enforce NOT NULL on TenantId columns** [Impact: Medium, Effort: Medium] - Closes data integrity gap, enables query planner optimizations - Trade-off: Requires brief table lock, must backfill nulls first 6. **Add partial index on Events (last 90 days)** [Impact: Medium, Effort: Low] - Reduces index bloat 70%, maintains query performance - Trade-off: Queries outside 90-day window slower (acceptable if rare) ### **NICE-TO-HAVE (P2 - Backlog)** 7. **Implement read replicas for dashboard queries** [Impact: Medium, Effort: High] - Offloads read traffic, improves availability during maintenance - Trade-off: Adds operational complexity, 5-10s replication lag acceptable? 8. **Add event_type index if filtering is common** [Impact: Low, Effort: Low] - Only if >20% of queries filter by event_type (validate with query logs) 9. **Normalize User → Account → Tenant hierarchy** [Impact: Low, Effort: High] - Reduces JOIN complexity if current structure causes N+1 queries - Trade-off: Major refactor, only justified if multi-tenancy model is broken --- ## KEY ASSUMPTIONS & GAPS **Assumptions Made:** - Events table is append-only (no UPDATEs/DELETEs) - Dashboard queries always scope to single TenantId (no cross-tenant analytics) - EventAggregates are acceptable to be 5-10 minutes stale - Database is relational (Postgres/MySQL-like), not NoSQL - Write volume is <5k events/second per tenant **Critical Missing Information:** 1. **TenantId cardinality & distribution** - Affects index strategy (high cardinality = BRIN vs B-tree consideration) 2. **Event payload schema** - JSON columns? If so, need GIN/GIST indexes for JSONB queries 3. **Aggregate grain** - Hourly/daily/per-event-type? Affects table size and query patterns 4. **Retention policy** - Legal/compliance requirements for data archival (GDPR, SOC2) 5. **Database engine & version** - Partitioning syntax, CONCURRENT index support, vacuum behavior differ 6. **Current table sizes** - Events/EventAggregates row count and disk usage informs urgency 7. **Multi-tenancy model** - Shared schema vs. schema-per-tenant vs. database-per-tenant 8. **Replication setup** - Read replicas, failover strategy, acceptable lag tolerance **Recommend Next Steps:** - Run EXPLAIN ANALYZE on 5 slowest dashboard queries to validate index strategy - Capture 1 week of query logs to identify actual filter patterns - Measure current index bloat ratio and autovacuum effectiveness - Define tenant churn rate (new/deleted tenants) for foreign key cascade strategy
🌀 Claude

Database Schema Optimizer

Add to Cart
Instant accessInstant access
Usage rightsCommercial use
Money-back guaranteeMoney‑back
By purchasing this prompt, you agree to our terms of service
CLAUDE-4-5-SONNET
Tested icon
Guide icon
4 examples icon
Free credits icon
Optimize and review database schemas for enterprise-grade performance, scalability, and maintainability. This prompt audits tables, indexes, relationships, query patterns, and data growth assumptions—then produces prioritized fixes to reduce latency, prevent integrity issues, and improve long-term evolution (migrations, versioning, and operational safety).
...more
Added over 1 month ago
Report
Browse Marketplace