feat: Add AUDIT_ONLY model kind for multi-table validation #5362
Add AUDIT_ONLY Model Kind for Multi-Table Validation
Summary
This PR introduces a new AUDIT_ONLY model kind to SQLMesh, addressing the gap in validating relationships between multiple tables without materializing unnecessary tables. This feature combines the benefits of models (DAG participation, dependencies, scheduling) with audit behavior (validation without materialization).
Problem Statement
Previously, SQLMesh users had to choose between:
Solution
The AUDIT_ONLY model kind enables users to:
Implementation Details
Core Changes
1. Model Kind Definition (sqlmesh/core/model/kind.py)
- Added AUDIT_ONLY to the ModelKindName enum
- Added AuditOnlyKind class with configuration:
  - blocking (default: True): Whether failures stop the pipeline
  - max_failing_rows (default: 10): Number of sample rows in error messages
  - is_symbolic=True (no materialization)
2. Execution Strategy (sqlmesh/core/snapshot/evaluator.py)
- Added AuditOnlyStrategy extending SymbolicStrategy
- Raises AuditError with sample data if validation fails
3. Parser Support (sqlmesh/core/dialect.py)
- Added AUDIT_ONLY to the list of model kinds that accept properties
4. Snapshot Definition (sqlmesh/core/snapshot/definition.py)
- Updated evaluatable property to include audit-only models
Testing
Unit Tests (tests/core/test_model.py)
Integration Tests (tests/core/test_integration.py)
Documentation
User Documentation Updates
- docs/concepts/audits.md: Added comprehensive AUDIT_ONLY section under Advanced Usage
- docs/concepts/models/model_kinds.md: Added detailed AUDIT_ONLY section with examples
- docs/reference/model_configuration.md: Added AUDIT_ONLY configuration reference
Example Models (examples/sushi/models/)
Added 3 demonstration models (all non-blocking for demo purposes):
- audit_order_integrity.sql: Validates referential integrity
- audit_waiter_revenue_anomalies.sql: Detects revenue anomalies
- audit_duplicate_orders.sql: Identifies duplicate orders
Usage Example
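As a rough, self-contained illustration of the behavior described above (not SQLMesh code), the sketch below treats a multi-table validation query as passing when it returns no rows. The names AuditOnlyConfig and run_audit_only, the AuditError stand-in class, and the customers/orders tables are all hypothetical and exist only for this sketch; only the blocking and max_failing_rows options and their defaults are taken from this PR description. Python's sqlite3 module stands in for a real engine.

```python
import sqlite3
from dataclasses import dataclass


class AuditError(Exception):
    """Stand-in for the AuditError named in this PR (not SQLMesh's class)."""


@dataclass(frozen=True)
class AuditOnlyConfig:
    # Hypothetical container mirroring the two options described in this PR.
    blocking: bool = True
    max_failing_rows: int = 10


def run_audit_only(conn, query, config=AuditOnlyConfig()):
    """Run a validation query; any returned row counts as a failure.

    Returns the sampled failing rows (an empty list on success).
    Nothing is written to the database, mirroring "no materialization".
    """
    # Fetch at most max_failing_rows rows to use as the error sample.
    sample = conn.execute(query).fetchmany(config.max_failing_rows)
    if sample and config.blocking:
        raise AuditError(f"audit failed; sample rows: {sample}")
    return sample


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 99);
    """
)

# Multi-table check: orders that reference a customer that does not exist.
ORPHANED_ORDERS = """
    SELECT o.id, o.customer_id
    FROM orders AS o
    LEFT JOIN customers AS c ON c.id = o.customer_id
    WHERE c.id IS NULL
"""

# Non-blocking: failing rows are returned for reporting instead of raising.
failing = run_audit_only(conn, ORPHANED_ORDERS, AuditOnlyConfig(blocking=False))
```

With blocking left at its default of True, the same call raises AuditError and includes the sampled rows in the message instead of returning them.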
Key Differences from Traditional Audits
Location: audits/ directory for traditional audits, models/ directory for AUDIT_ONLY models
Migration Path
Testing Instructions
Run unit tests:
Run integration tests:
Try the sushi examples:
Create a test AUDIT_ONLY model:
Related Issues
Addresses the need for multi-table validation without materialization.
Notes for Reviewers
Future Enhancements (Not in this PR)