
[CHORE]: Establish database migration testing pipeline with rollback validation across SQLite, Postgres, and Redis #252

@crivetimihai

Description

🧭 Chore Summary

Establish a comprehensive database migration testing pipeline with rollback validation across SQLite, Postgres, and Redis: a make migration-test target that automatically tests all migration paths, validates data integrity, and ensures safe rollback for production deployments.


🧱 Areas Affected

  • Database migration infrastructure / Make targets (make migration-test, make migration-rollback, make migration-perf)
  • CI/CD pipeline integration (GitHub Actions migration testing jobs)
  • Alembic migration scripts and validation
  • Multi-database testing environments (SQLite, Postgres, Redis)
  • Data integrity and consistency validation
  • Migration performance and rollback timing analysis
  • Production migration safety and deployment verification

⚙️ Context / Rationale

Database migrations are high-risk operations that can cause production outages if not thoroughly tested. A comprehensive migration testing pipeline validates forward migrations, rollback scenarios, data integrity, and performance across all supported database types. This prevents data loss, corruption, and downtime during deployments while ensuring migrations work consistently across SQLite (development), Postgres (production), and Redis (caching/sessions).

What is Migration Testing?
Systematic validation of database schema changes through automated testing of migration scripts, rollback procedures, and data integrity checks across multiple database engines and deployment scenarios.

Key Components:

  • Forward Migration Testing: Validate schema changes apply correctly
  • Rollback Testing: Ensure migrations can be safely reversed
  • Data Integrity Validation: Verify no data loss or corruption occurs
  • Cross-Database Compatibility: Test identical behavior across SQLite/Postgres/Redis
  • Performance Impact Analysis: Measure migration execution time and resource usage

Simple Migration Test Example:

from alembic import command
from alembic.config import Config
from sqlalchemy import create_engine, inspect

def table_exists(name: str) -> bool:
    # Inspect the test database for the named table
    return name in inspect(create_engine("sqlite:///test.db")).get_table_names()

def test_migration_forward_and_rollback():
    """Test migration 001_create_tools_table forward and rollback."""
    cfg = Config("alembic.ini")

    # Apply migration
    command.upgrade(cfg, "001")
    assert table_exists("tools")

    # Test rollback
    command.downgrade(cfg, "base")
    assert not table_exists("tools")

Advanced Migration Testing Setup:

class MigrationTestSuite:
    """Comprehensive migration testing across all database types.

    The helper methods referenced below (get_migration_revisions,
    apply_migration, validate_*) live in the full suite under
    tests/migrations/ and are elided here.
    """

    DATABASES = {
        'sqlite': 'sqlite:///test.db',
        'postgres': 'postgresql://test:test@localhost:5432/testdb',
        'redis': 'redis://localhost:6379/0'
    }
    
    def test_migration_chain(self, db_type: str):
        """Test complete migration chain forward and backward."""
        db_url = self.DATABASES[db_type]
        
        # Test forward migrations
        for revision in self.get_migration_revisions():
            self.apply_migration(db_url, revision)
            self.validate_schema_state(db_url, revision)
            self.validate_data_integrity(db_url)
            
        # Test rollback migrations
        for revision in reversed(self.get_migration_revisions()):
            self.rollback_migration(db_url, revision)
            self.validate_rollback_integrity(db_url, revision)

MCPGateway-Specific Migration Scenarios:

# Migration test matrix for mcpgateway
migration_test_scenarios:
  # Core schema migrations
  - name: "tools_table_lifecycle"
    description: "Tools table creation, modification, and deletion"
    migrations:
      - "001_create_tools_table"
      - "002_add_tools_metadata_columns" 
      - "003_add_tools_performance_index"
    test_data:
      - file: "test_data/sample_tools.json"
        records: 100
    rollback_validation:
      - verify_no_data_loss: true
      - check_foreign_key_constraints: true

  # Server management migrations  
  - name: "server_gateway_relationships"
    description: "Server and gateway relationship schema changes"
    migrations:
      - "004_create_servers_table"
      - "005_create_gateways_table"
      - "006_add_server_gateway_associations"
    test_data:
      - file: "test_data/server_configs.json"
        records: 50
    performance_thresholds:
      sqlite_max_duration: 1000  # ms
      postgres_max_duration: 5000
      
  # Redis session/cache migrations
  - name: "redis_schema_updates"
    description: "Redis key structure and caching schema changes"
    migrations:
      - "007_update_session_key_format"
      - "008_add_tool_response_caching"
    test_scenarios:
      - cache_migration_with_active_sessions
      - key_format_backward_compatibility

  # Performance-critical migrations
  - name: "large_table_modifications"
    description: "Schema changes on tables with significant data"
    migrations:
      - "009_add_tool_usage_statistics"
      - "010_partition_logs_table"
    test_data:
      - file: "test_data/large_dataset.json" 
        records: 10000
    performance_requirements:
      max_downtime_seconds: 30
      max_lock_duration_seconds: 5
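
A loader sketch can turn this matrix into test parameters. The snippet below assumes the matrix is saved as tests/migrations/scenarios.yaml (the filename is illustrative; the top-level key matches the matrix above):

import yaml
from pathlib import Path

def load_scenarios(path: str = "tests/migrations/scenarios.yaml") -> list[dict]:
    """Parse the migration test matrix into scenario dicts."""
    return yaml.safe_load(Path(path).read_text())["migration_test_scenarios"]

# Each scenario's migrations can then be applied and validated in listed order
for scenario in load_scenarios():
    print(scenario["name"], "->", ", ".join(scenario["migrations"]))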

📦 Related Make Targets

| Target | Purpose |
|--------|---------|
| **make migration-test** | Run complete migration test suite across all database types |
| **make migration-rollback** | Test rollback scenarios and data integrity validation |
| **make migration-perf** | Analyze migration performance and execution timing |
| make migration-validate | Validate migration script syntax and compatibility |
| make migration-data-test | Test migrations with realistic production-sized datasets |
| make migration-cross-db | Verify identical behavior across SQLite, Postgres, Redis |
| make migration-safety | Run pre-deployment migration safety checks |
| make migration-report | Generate migration test reports and documentation |
| make migration-clean | Clean migration test databases and temporary data |
| make migration-backup-test | Test backup/restore procedures during migrations |

Bold targets are mandatory; CI must fail if any migration test fails or rollback validation errors occur.


📋 Acceptance Criteria

  • make migration-test validates all migrations forward and backward across SQLite, Postgres, and Redis.
  • make migration-rollback successfully tests rollback scenarios with zero data loss.
  • make migration-perf measures and reports migration execution times within acceptable limits.
  • Migration tests run automatically in CI with multiple database configurations.
  • Data integrity validation occurs before and after every migration step.
  • Large dataset migration testing validates production-scale performance.
  • Cross-database compatibility ensures identical behavior across all supported databases.
  • Migration safety checks prevent deployment of dangerous migrations.
  • Rollback procedures are tested and validated for all recent migrations.
  • Performance regression detection alerts when migrations become too slow.
  • Changelog entry under "Database" or "Testing".

πŸ› οΈ Task List (suggested flow)

  1. Migration testing infrastructure

    mkdir -p tests/migrations/{fixtures,data,reports}
    
    # Create migration test configuration
    cat > tests/migrations/config.yaml << 'EOF'
    databases:
      sqlite:
        url: "sqlite:///test_migrations.db"
        max_migration_time: 1000  # ms
      postgres:
        url: "postgresql://test:test@localhost:5432/migration_test"
        max_migration_time: 5000
      redis:
        url: "redis://localhost:6379/15"  # Use test database
        max_migration_time: 2000
    
    test_scenarios:
      - name: "empty_database"
        description: "Test migrations on fresh database"
        data_fixtures: []
        
      - name: "with_sample_data"
        description: "Test migrations with existing data"
        data_fixtures: ["sample_tools.json", "sample_servers.json"]
        
      - name: "large_dataset"
        description: "Test migration performance with large datasets"
        data_fixtures: ["large_tools.json"]
        performance_test: true
    EOF
  2. Makefile integration

    # Database Migration Testing Targets
    .PHONY: migration-test migration-rollback migration-perf migration-validate migration-data-test migration-cross-db migration-safety migration-clean
    
    MIGRATION_TEST_DIR := tests/migrations
    MIGRATION_REPORTS := $(MIGRATION_TEST_DIR)/reports
    ALEMBIC_CONFIG := alembic.ini
    
    migration-test: migration-validate
    	@echo "πŸ—„οΈ  Running comprehensive migration tests..."
    	@mkdir -p $(MIGRATION_REPORTS)
    	@python -m pytest $(MIGRATION_TEST_DIR)/test_migrations.py \
    		--tb=short \
    		--junitxml=$(MIGRATION_REPORTS)/migration-results.xml \
    		-v
    
    migration-rollback:
    	@echo "βͺ Testing migration rollback scenarios..."
    	@python $(MIGRATION_TEST_DIR)/test_rollback.py \
    		--all-databases \
    		--verify-integrity \
    		--report-file $(MIGRATION_REPORTS)/rollback-report.json
    
    migration-perf:
    	@echo "⚑ Analyzing migration performance..."
    	@python $(MIGRATION_TEST_DIR)/performance_test.py \
    		--config $(MIGRATION_TEST_DIR)/config.yaml \
    		--output $(MIGRATION_REPORTS)/performance-report.html \
    		--large-dataset
    
    migration-validate:
    	@echo "βœ… Validating migration scripts..."
    	@alembic check
    	@python $(MIGRATION_TEST_DIR)/validate_migrations.py \
    		--alembic-config $(ALEMBIC_CONFIG) \
    		--check-syntax \
    		--check-dependencies
    
    migration-data-test:
    	@echo "πŸ“Š Testing migrations with production-scale data..."
    	@python $(MIGRATION_TEST_DIR)/large_data_test.py \
    		--generate-test-data 10000 \
    		--test-all-databases \
    		--measure-performance
    
    migration-cross-db:
    	@echo "πŸ”„ Testing cross-database migration compatibility..."
    	@python $(MIGRATION_TEST_DIR)/cross_db_test.py \
    		--compare-schemas \
    		--verify-behavior \
    		--report-differences
    
    migration-safety:
    	@echo "πŸ›‘οΈ  Running migration safety checks..."
    	@python $(MIGRATION_TEST_DIR)/safety_check.py \
    		--check-dangerous-operations \
    		--validate-constraints \
    		--check-data-loss-risk
    
    migration-clean:
    	@echo "🧹 Cleaning migration test artifacts..."
    	@rm -f test_migrations.db migration_test_*.db
    	@docker-compose -f docker-compose.test.yml down
    	@rm -rf $(MIGRATION_REPORTS)/*
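
    The migration-clean target (and the final validation step) shell out to a docker-compose.test.yml that is not shown in this issue; a minimal sketch matching the test URLs in config.yaml might look like:

    # docker-compose.test.yml (hypothetical sketch)
    services:
      postgres:
        image: postgres:15
        environment:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: migration_test
        ports:
          - "5432:5432"
      redis:
        image: redis:7
        ports:
          - "6379:6379"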
  3. Comprehensive migration test suite

    # tests/migrations/test_migrations.py
    import pytest
    import alembic.command
    import alembic.config
    from sqlalchemy import create_engine, text
    from sqlalchemy.orm import sessionmaker
    import redis
    import json
    from pathlib import Path
    
    class TestMigrations:
        """Comprehensive migration testing across all database types."""
        
        @pytest.fixture(params=['sqlite', 'postgres'])
        def db_engine(self, request):
            """Create database engine for testing."""
            db_configs = {
                'sqlite': 'sqlite:///test_migrations.db',
                'postgres': 'postgresql://test:test@localhost:5432/migration_test'
            }
            
            engine = create_engine(db_configs[request.param])
            yield engine, request.param
            engine.dispose()
            
        @pytest.fixture
        def redis_client(self):
            """Create Redis client for testing."""
            client = redis.Redis(host='localhost', port=6379, db=15)
            client.flushdb()  # Clean test database
            yield client
            client.flushdb()
            client.close()
            
        def test_migration_forward_all_revisions(self, db_engine):
            """Test all migrations forward in sequence."""
            engine, db_type = db_engine
            alembic_cfg = alembic.config.Config("alembic.ini")
            # Point Alembic at the engine under test, not the default URL in alembic.ini
            alembic_cfg.set_main_option("sqlalchemy.url", str(engine.url))

            # Start from base
            alembic.command.downgrade(alembic_cfg, "base")
            
            # Get all revisions
            revisions = self._get_migration_revisions(alembic_cfg)
            
            # Apply each migration and validate
            for revision in revisions:
                alembic.command.upgrade(alembic_cfg, revision)
                self._validate_schema_state(engine, revision)
                self._check_data_integrity(engine)
                
        def test_migration_rollback_sequence(self, db_engine):
            """Test rollback scenarios for recent migrations."""
            engine, db_type = db_engine
            alembic_cfg = alembic.config.Config("alembic.ini")
            alembic_cfg.set_main_option("sqlalchemy.url", str(engine.url))
            
            # Apply all migrations
            alembic.command.upgrade(alembic_cfg, "head")
            
            # Insert test data
            self._insert_test_data(engine)
            initial_data = self._capture_data_snapshot(engine)
            
            # Test rollback of last 3 migrations
            revisions = self._get_migration_revisions(alembic_cfg)[-3:]
            
            for revision in reversed(revisions):
                alembic.command.downgrade(alembic_cfg, revision)
                self._validate_rollback_integrity(engine, revision, initial_data)
                
        def test_migration_with_existing_data(self, db_engine):
            """Test migrations with pre-existing data."""
            engine, db_type = db_engine
            alembic_cfg = alembic.config.Config("alembic.ini")
            alembic_cfg.set_main_option("sqlalchemy.url", str(engine.url))
            
            # Set up initial schema and data
            alembic.command.upgrade(alembic_cfg, "001")  # Base tables
            self._load_fixture_data(engine, "sample_tools.json")
            
            # Capture data before migration
            pre_migration_data = self._capture_data_snapshot(engine)
            
            # Apply remaining migrations
            alembic.command.upgrade(alembic_cfg, "head")
            
            # Verify data preservation
            post_migration_data = self._capture_data_snapshot(engine)
            self._compare_data_integrity(pre_migration_data, post_migration_data)
            
        def test_redis_migration_scenarios(self, redis_client):
            """Test Redis schema/key format migrations."""
            # Simulate old key format
            redis_client.set("session:user:123", json.dumps({"data": "old_format"}))
            redis_client.set("cache:tool:456", json.dumps({"result": "cached"}))
            
            # Apply Redis migration (key format update)
            self._apply_redis_migration(redis_client, "007_update_session_key_format")
            
            # Verify new key format while preserving data
            assert redis_client.exists("session:v2:user:123")
            assert not redis_client.exists("session:user:123")
            
            # Verify data integrity
            session_data = json.loads(redis_client.get("session:v2:user:123"))
            assert session_data["data"] == "old_format"
            
        def test_large_dataset_migration_performance(self, db_engine):
            """Test migration performance with large datasets."""
            engine, db_type = db_engine
            alembic_cfg = alembic.config.Config("alembic.ini")
            alembic_cfg.set_main_option("sqlalchemy.url", str(engine.url))
            
            # Set up base schema
            alembic.command.upgrade(alembic_cfg, "001")
            
            # Insert large dataset
            self._insert_large_dataset(engine, num_records=10000)
            
            # Measure migration performance
            import time
            start_time = time.time()
            
            alembic.command.upgrade(alembic_cfg, "head")
            
            migration_duration = time.time() - start_time
            
            # Performance thresholds by database type
            max_durations = {
                'sqlite': 10.0,    # 10 seconds
                'postgres': 30.0   # 30 seconds
            }
            
            assert migration_duration < max_durations[db_type], \
                f"Migration took {migration_duration:.2f}s, exceeds {max_durations[db_type]}s limit"
  4. Rollback testing framework

    #!/usr/bin/env python3
    # tests/migrations/test_rollback.py
    """
    Comprehensive rollback testing for database migrations.
    """
    
    import argparse
    import json
    import time
    from pathlib import Path
    from typing import Dict, Any, List
    import alembic.command
    import alembic.config
    from sqlalchemy import create_engine, text
    
    class RollbackTester:
        """Test migration rollback scenarios and data integrity."""
        
        def __init__(self, config_path: str):
            self.config = self._load_config(config_path)
            self.results = []
            
        def test_all_rollback_scenarios(self) -> List[Dict[str, Any]]:
            """Test rollback scenarios across all databases."""
            
            for db_name, db_config in self.config['databases'].items():
                if db_name == 'redis':
                    continue  # Redis doesn't use Alembic
                    
                print(f"πŸ”„ Testing {db_name} rollback scenarios...")
                
                engine = create_engine(db_config['url'])
                result = self._test_database_rollbacks(engine, db_name)
                self.results.append(result)
                engine.dispose()
                
            return self.results
            
        def _test_database_rollbacks(self, engine, db_name: str) -> Dict[str, Any]:
            """Test rollback scenarios for a specific database."""
            alembic_cfg = alembic.config.Config("alembic.ini")
            
            # Get recent migrations (last 5)
            revisions = self._get_recent_revisions(alembic_cfg, 5)
            
            test_result = {
                'database': db_name,
                'rollback_tests': [],
                'success': True,
                'errors': []
            }
            
            try:
                for revision in revisions:
                    rollback_test = self._test_single_rollback(
                        engine, alembic_cfg, revision
                    )
                    test_result['rollback_tests'].append(rollback_test)
                    
                    if not rollback_test['success']:
                        test_result['success'] = False
                        test_result['errors'].append(rollback_test['error'])
                        
            except Exception as e:
                test_result['success'] = False
                test_result['errors'].append(str(e))
                
            return test_result
            
        def _test_single_rollback(self, engine, alembic_cfg, revision: str) -> Dict[str, Any]:
            """Test rollback of a single migration."""
            print(f"  Testing rollback of revision {revision}...")
            
            rollback_result = {
                'revision': revision,
                'success': False,
                'duration': 0,
                'data_integrity_check': False,
                'error': None
            }
            
            try:
                # Apply migration and add test data
                alembic.command.upgrade(alembic_cfg, revision)
                self._insert_test_data(engine, revision)
                
                # Capture pre-rollback state
                pre_rollback_data = self._capture_critical_data(engine)
                
                # Perform rollback
                start_time = time.time()
                previous_revision = self._get_previous_revision(alembic_cfg, revision)
                alembic.command.downgrade(alembic_cfg, previous_revision or "base")
                rollback_result['duration'] = time.time() - start_time
                
                # Verify data integrity
                rollback_result['data_integrity_check'] = self._verify_rollback_integrity(
                    engine, pre_rollback_data, revision
                )
                
                rollback_result['success'] = True
                
            except Exception as e:
                rollback_result['error'] = str(e)
                
            return rollback_result
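
    The Makefile invokes this script with --all-databases, --verify-integrity, and --report-file; a minimal entry point wiring up those flags (the _load_config body is a sketch assuming PyYAML and the config.yaml from step 1):

    def _load_config(self, config_path: str) -> dict:
        """RollbackTester._load_config: parse the YAML test configuration."""
        import yaml
        return yaml.safe_load(Path(config_path).read_text())

    def main() -> int:
        parser = argparse.ArgumentParser(description="Migration rollback testing")
        parser.add_argument("--all-databases", action="store_true")
        parser.add_argument("--verify-integrity", action="store_true")
        parser.add_argument("--report-file", default="rollback-report.json")
        args = parser.parse_args()

        tester = RollbackTester("tests/migrations/config.yaml")
        results = tester.test_all_rollback_scenarios()
        Path(args.report_file).write_text(json.dumps(results, indent=2, default=str))

        # Non-zero exit makes `make migration-rollback` fail CI on rollback errors
        return 0 if all(r["success"] for r in results) else 1

    if __name__ == "__main__":
        raise SystemExit(main())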
  5. Performance analysis

    #!/usr/bin/env python3
    # tests/migrations/performance_test.py
    """
    Migration performance analysis and reporting.
    """
    
    import time
    import psutil
    import matplotlib.pyplot as plt
    import pandas as pd
    from typing import Dict, List, Any
    import alembic.command
    import alembic.config
    from sqlalchemy import create_engine
    
    class MigrationPerformanceAnalyzer:
        """Analyze migration performance and resource usage."""
        
        def __init__(self, config_path: str):
            self.config = self._load_config(config_path)
            self.performance_data = []
            
        def analyze_all_migrations(self) -> Dict[str, Any]:
            """Analyze performance across all databases and migrations."""
            
            results = {
                'databases': {},
                'summary': {},
                'recommendations': []
            }
            
            for db_name, db_config in self.config['databases'].items():
                if db_name == 'redis':
                    continue
                    
                print(f"⚑ Analyzing {db_name} migration performance...")
                
                engine = create_engine(db_config['url'])
                db_results = self._analyze_database_migrations(engine, db_name, db_config)
                results['databases'][db_name] = db_results
                engine.dispose()
                
            # Generate summary and recommendations
            results['summary'] = self._generate_performance_summary(results['databases'])
            results['recommendations'] = self._generate_recommendations(results['databases'])
            
            return results
            
        def _analyze_database_migrations(self, engine, db_name: str, db_config: Dict) -> Dict[str, Any]:
            """Analyze migration performance for a specific database."""
            alembic_cfg = alembic.config.Config("alembic.ini")
            
            # Test scenarios: empty DB, with sample data, with large dataset
            scenarios = [
                {'name': 'empty_db', 'data_size': 0},
                {'name': 'sample_data', 'data_size': 1000},
                {'name': 'large_dataset', 'data_size': 10000}
            ]
            
            db_results = {
                'scenarios': {},
                'average_performance': {},
                'bottlenecks': []
            }
            
            for scenario in scenarios:
                scenario_results = self._test_migration_scenario(
                    engine, alembic_cfg, scenario, db_config
                )
                db_results['scenarios'][scenario['name']] = scenario_results
                
            return db_results
            
        def _test_migration_scenario(self, engine, alembic_cfg, scenario: Dict, db_config: Dict) -> Dict[str, Any]:
            """Test migration performance for a specific scenario."""
            
            # Reset database
            alembic.command.downgrade(alembic_cfg, "base")
            
            # Set up test data
            if scenario['data_size'] > 0:
                self._insert_test_data(engine, scenario['data_size'])
                
            # Get all migrations
            revisions = self._get_migration_revisions(alembic_cfg)
            
            scenario_results = {
                'migrations': [],
                'total_duration': 0,
                'peak_memory_usage': 0,
                'peak_cpu_usage': 0
            }
            
            # Monitor resource usage; the first cpu_percent() call primes the
            # counter so later readings reflect usage since the previous call
            process = psutil.Process()
            process.cpu_percent()
            
            for revision in revisions:
                migration_start = time.time()
                initial_memory = process.memory_info().rss / 1024 / 1024  # MB
                
                # Apply migration
                alembic.command.upgrade(alembic_cfg, revision)
                
                migration_duration = time.time() - migration_start
                final_memory = process.memory_info().rss / 1024 / 1024
                cpu_percent = process.cpu_percent()
                
                migration_result = {
                    'revision': revision,
                    'duration': migration_duration,
                    'memory_delta': final_memory - initial_memory,
                    'cpu_usage': cpu_percent,
                    'data_size': scenario['data_size']
                }
                
                scenario_results['migrations'].append(migration_result)
                scenario_results['total_duration'] += migration_duration
                scenario_results['peak_memory_usage'] = max(
                    scenario_results['peak_memory_usage'], 
                    final_memory
                )
                scenario_results['peak_cpu_usage'] = max(
                    scenario_results['peak_cpu_usage'],
                    cpu_percent
                )
                
                # Check performance thresholds
                max_duration = db_config.get('max_migration_time', 5000) / 1000
                if migration_duration > max_duration:
                    print(f"⚠️  Migration {revision} exceeded threshold: {migration_duration:.2f}s")
                    
            return scenario_results
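
    The migration-perf target writes an HTML report; a small sketch using the already-imported pandas (the function name and column layout are illustrative):

    def write_html_report(results: dict, output_path: str) -> None:
        """Flatten per-migration timings into an HTML table report."""
        rows = [
            {"database": db, "scenario": name, **migration}
            for db, db_results in results["databases"].items()
            for name, scenario in db_results["scenarios"].items()
            for migration in scenario["migrations"]
        ]
        pd.DataFrame(rows).to_html(output_path, index=False)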
  6. CI integration

    # Add to existing GitHub Actions workflow
    migration-testing:
      name: 🗄️  Database Migration Testing
      runs-on: ubuntu-latest
      needs: [test]
      
      services:
        postgres:
          image: postgres:15
          env:
            POSTGRES_PASSWORD: test
            POSTGRES_USER: test
            POSTGRES_DB: migration_test
          options: >-
            --health-cmd pg_isready
            --health-interval 10s
            --health-timeout 5s
            --health-retries 5
        
        redis:
          image: redis:7
          options: >-
            --health-cmd "redis-cli ping"
            --health-interval 10s
            --health-timeout 5s
            --health-retries 5
    
      strategy:
        matrix:
          migration-scenario: 
            - "forward_migrations"
            - "rollback_testing" 
            - "performance_analysis"
            - "large_dataset_testing"
    
      steps:
        - name: ⬇️  Checkout source
          uses: actions/checkout@v4
          with:
            fetch-depth: 0
            
        - name: 🐍  Set up Python
          uses: actions/setup-python@v5
          with:
            python-version: "3.12"
            cache: pip
            
        - name: 📦  Install dependencies
          run: |
            python -m pip install --upgrade pip
            pip install -e .[dev]
            pip install pytest-xdist pytest-cov
            
        - name: 🗄️  Run migration tests - ${{ matrix.migration-scenario }}
          run: |
            case "${{ matrix.migration-scenario }}" in
              "forward_migrations")
                make migration-test
                ;;
              "rollback_testing")
                make migration-rollback
                ;;
              "performance_analysis")
                make migration-perf
                ;;
              "large_dataset_testing")
                make migration-data-test
                ;;
            esac
            
        - name: 📊  Upload migration reports
          uses: actions/upload-artifact@v4
          with:
            name: migration-reports-${{ matrix.migration-scenario }}
            path: |
              tests/migrations/reports/
            retention-days: 30
            
        - name: 🚨  Migration Safety Check
          run: |
            make migration-safety
            
        - name: 💬  Comment PR with migration results
          if: github.event_name == 'pull_request' && matrix.migration-scenario == 'forward_migrations'
          uses: actions/github-script@v7
          with:
            script: |
              const fs = require('fs');
              const reportPath = 'tests/migrations/reports/migration-summary.md';
              if (fs.existsSync(reportPath)) {
                const report = fs.readFileSync(reportPath, 'utf8');
                github.rest.issues.createComment({
                  issue_number: context.issue.number,
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                body: `## 🗄️ Migration Test Results\n\n${report}`
                });
              }
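
    The PR-comment step reads tests/migrations/reports/migration-summary.md, which no target above generates; a hypothetical generator, rendering the rollback JSON report as markdown:

    # tests/migrations/write_summary.py (hypothetical)
    import json
    from pathlib import Path

    def write_summary(report: str = "tests/migrations/reports/rollback-report.json",
                      out: str = "tests/migrations/reports/migration-summary.md") -> None:
        """Render the rollback report as the markdown table the CI comment posts."""
        results = json.loads(Path(report).read_text())
        lines = ["| Database | Rollback tests | Success | Errors |", "|---|---|---|---|"]
        for r in results:
            lines.append(f"| {r['database']} | {len(r['rollback_tests'])} "
                         f"| {'yes' if r['success'] else 'no'} | {len(r['errors'])} |")
        Path(out).write_text("\n".join(lines) + "\n")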
  7. Migration safety checker

    # tests/migrations/safety_check.py
    #!/usr/bin/env python3
    """
    Migration safety analysis to prevent dangerous operations.
    """
    
    import re
    import ast
    from pathlib import Path
    from typing import List, Dict, Any
    
    class MigrationSafetyChecker:
        """Analyze migrations for potentially dangerous operations."""
        
        # Regex patterns for dangerous operations, matched with re.IGNORECASE below:
        # DROP TABLE/COLUMN, TRUNCATE, and mass delete/update operations that risk data loss.
        HIGH_RISK_PATTERNS = [
            r'drop\s+table',
            r'drop\s+column',
            r'alter\s+table.*drop',
            r'truncate\s+table',
            r'delete\s+from\s+\w+\s*;',              # DELETE without a WHERE clause
            r'delete\s+from.*where.*is\s+null',      # potential data loss
            r'update\s+\w+\s+set\s+\w+\s*=\s*null',  # mass nullification
        ]
        
        def check_all_migrations(self, migrations_dir: Path) -> Dict[str, Any]:
            """Check all migration files for safety issues."""
            
            safety_report = {
                'safe_migrations': [],
                'risky_migrations': [],
                'dangerous_migrations': [],
                'recommendations': []
            }
            
            migration_files = list(migrations_dir.glob('**/versions/*.py'))
            
            for migration_file in migration_files:
                safety_analysis = self._analyze_migration_file(migration_file)
                
                if safety_analysis['risk_level'] == 'safe':
                    safety_report['safe_migrations'].append(safety_analysis)
                elif safety_analysis['risk_level'] == 'risky':
                    safety_report['risky_migrations'].append(safety_analysis)
                else:  # dangerous
                    safety_report['dangerous_migrations'].append(safety_analysis)
                    
            # Generate recommendations
            safety_report['recommendations'] = self._generate_safety_recommendations(
                safety_report
            )
            
            return safety_report
            
        def _analyze_migration_file(self, migration_file: Path) -> Dict[str, Any]:
            """Analyze a single migration file for safety."""
            
            content = migration_file.read_text()
            
            analysis = {
                'file': str(migration_file),
                'revision': self._extract_revision(migration_file),
                'risk_level': 'safe',
                'issues': [],
                'warnings': [],
                'requires_review': False
            }
            
            # Check for dangerous SQL operations
            for pattern in self.HIGH_RISK_PATTERNS:
                matches = re.findall(pattern, content, re.MULTILINE | re.IGNORECASE)
                if matches:
                    analysis['issues'].append({
                        'type': 'dangerous_sql',
                        'pattern': pattern,
                        'matches': matches
                    })
                    analysis['risk_level'] = 'dangerous'
                    
            # Check for data modification without proper safeguards
            if self._has_data_modification_without_transaction(content):
                analysis['warnings'].append({
                    'type': 'no_transaction_safety',
                    'message': 'Data modification without explicit transaction handling'
                })
                if analysis['risk_level'] == 'safe':
                    analysis['risk_level'] = 'risky'
                    
            # Check for missing rollback implementation
            if not self._has_proper_downgrade(content):
                analysis['warnings'].append({
                    'type': 'missing_rollback',
                    'message': 'Downgrade function is empty or missing'
                })
                analysis['requires_review'] = True
                
            return analysis
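
    One way the _has_proper_downgrade check could use the imported ast module (a sketch; a missing downgrade() or a pass/docstring-only body counts as "no rollback"):

    def _has_proper_downgrade(self, content: str) -> bool:
        """Return False when downgrade() is absent or an empty stub."""
        try:
            tree = ast.parse(content)
        except SyntaxError:
            return False
        for node in ast.walk(tree):
            if isinstance(node, ast.FunctionDef) and node.name == "downgrade":
                # Ignore bare `pass` statements and docstring expressions
                real = [stmt for stmt in node.body
                        if not isinstance(stmt, ast.Pass)
                        and not (isinstance(stmt, ast.Expr)
                                 and isinstance(stmt.value, ast.Constant))]
                return bool(real)
        return False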
  8. Data integrity validation

    # tests/migrations/data_integrity.py
    """
    Data integrity validation for migration testing.
    """
    
    import hashlib
    import json
    from typing import Dict, Any, List
    from sqlalchemy import create_engine, text
    from sqlalchemy.orm import sessionmaker
    
    class DataIntegrityValidator:
        """Validate data integrity before and after migrations."""
        
        def __init__(self, engine):
            self.engine = engine
            self.Session = sessionmaker(bind=engine)
            
        def capture_data_snapshot(self) -> Dict[str, Any]:
            """Capture a snapshot of current data state."""
            
            snapshot = {
                'tables': {},
                'row_counts': {},
                'data_checksums': {},
                'foreign_key_counts': {},
                'index_counts': {}
            }
            
            with self.Session() as session:
                # Get all table names
                tables = self._get_table_names(session)
                
                for table in tables:
                    # Row count
                    count_result = session.execute(text(f"SELECT COUNT(*) FROM {table}"))
                    snapshot['row_counts'][table] = count_result.scalar()
                    
                    # Data checksum (for small tables)
                    if snapshot['row_counts'][table] < 10000:
                        checksum = self._calculate_table_checksum(session, table)
                        snapshot['data_checksums'][table] = checksum
                        
                    # Foreign key relationships
                    fk_count = self._count_foreign_key_relationships(session, table)
                    snapshot['foreign_key_counts'][table] = fk_count
                    
            return snapshot
            
        def validate_migration_integrity(self, pre_snapshot: Dict, post_snapshot: Dict, 
                                       migration_revision: str) -> Dict[str, Any]:
            """Validate data integrity after migration."""
            
            validation_result = {
                'migration_revision': migration_revision,
                'passed': True,
                'issues': [],
                'warnings': [],
                'data_changes': {}
            }
            
            # Check for unexpected data loss
            for table, pre_count in pre_snapshot['row_counts'].items():
                post_count = post_snapshot['row_counts'].get(table, 0)
                
                if post_count < pre_count:
                    validation_result['issues'].append({
                        'type': 'data_loss',
                        'table': table,
                        'lost_rows': pre_count - post_count,
                        'message': f"Lost {pre_count - post_count} rows from {table}"
                    })
                    validation_result['passed'] = False
                    
                elif post_count > pre_count:
                    validation_result['warnings'].append({
                        'type': 'data_addition',
                        'table': table,
                        'added_rows': post_count - pre_count,
                        'message': f"Added {post_count - pre_count} rows to {table}"
                    })
                    
            # Check data checksums for unchanged tables
            for table, pre_checksum in pre_snapshot['data_checksums'].items():
                if table in post_snapshot['data_checksums']:
                    post_checksum = post_snapshot['data_checksums'][table]
                    
                    if pre_checksum != post_checksum and table not in validation_result['data_changes']:
                        validation_result['warnings'].append({
                            'type': 'data_modification',
                            'table': table,
                            'message': f"Data in {table} was modified during migration"
                        })
                        
            # Check foreign key integrity
            for table, pre_fk_count in pre_snapshot['foreign_key_counts'].items():
                post_fk_count = post_snapshot['foreign_key_counts'].get(table, 0)
                
                if post_fk_count != pre_fk_count:
                    validation_result['warnings'].append({
                        'type': 'foreign_key_change',
                        'table': table,
                        'change': post_fk_count - pre_fk_count,
                        'message': f"Foreign key relationships changed in {table}"
                    })
                    
            return validation_result
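
    The _calculate_table_checksum helper can be sketched with the imported hashlib; ordering by the first column keeps the digest stable across runs (this assumes the first column is deterministic, e.g. a primary key):

    def _calculate_table_checksum(self, session, table: str) -> str:
        """Order-stable MD5 digest over all rows for before/after comparison."""
        digest = hashlib.md5()
        rows = session.execute(text(f"SELECT * FROM {table} ORDER BY 1"))
        for row in rows:
            digest.update(("|".join(str(value) for value in row) + "\n").encode())
        return digest.hexdigest()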
  9. Documentation integration

    Add migration testing documentation:

    # Database Migration Testing
    
    ## Overview
    
    Comprehensive testing pipeline for database migrations across SQLite, Postgres, and Redis.
    
    ## Running Migration Tests
    
    ```bash
    # Run complete migration test suite
    make migration-test
    
    # Test rollback scenarios
    make migration-rollback
    
    # Analyze migration performance  
    make migration-perf
    
    # Test with large datasets
    make migration-data-test
    
    # Cross-database compatibility
    make migration-cross-db
    
    # Safety checks before deployment
    make migration-safety
    ```

    ## Test Scenarios

    | Scenario | Description | Databases |
    |----------|-------------|-----------|
    | Forward Migrations | Test all migrations in sequence | SQLite, Postgres |
    | Rollback Testing | Validate rollback capabilities | SQLite, Postgres |
    | Data Integrity | Verify no data loss/corruption | All |
    | Performance Analysis | Measure migration execution time | All |
    | Large Dataset Testing | Test with 10K+ records | SQLite, Postgres |
    | Redis Migrations | Test key format changes | Redis |

    ## Safety Checks

    The migration safety checker identifies:

    • Dangerous operations (DROP TABLE, TRUNCATE)
    • Missing rollback implementations
    • Data modification without safeguards
    • High-risk SQL patterns
    
    
  10. Final validation

    # Test complete migration testing pipeline
    
    # 1. Set up test databases
    docker-compose -f docker-compose.test.yml up -d postgres redis
    
    # 2. Run all migration tests
    make migration-test
    make migration-rollback
    make migration-perf
    make migration-safety
    
    # 3. Test with large datasets
    make migration-data-test
    
    # 4. Cross-database compatibility
    make migration-cross-db
    
    # 5. Clean up
    make migration-clean
    
    # 6. Verify CI integration
    git add tests/migrations/
    git commit -m "Add comprehensive migration testing pipeline"
    git push  # Triggers CI with migration tests


🧩 Additional Notes

  • Zero-downtime migrations: Test migrations that can run alongside production traffic without blocking operations.
  • Data preservation: Comprehensive validation ensures no data loss during schema changes and rollbacks.
  • Performance monitoring: Track migration execution time to prevent slow deployments that impact availability.
  • Cross-database consistency: Ensure identical behavior across SQLite (development) and Postgres (production).
  • Redis migration support: Handle Redis key format changes and session data preservation.
  • Safety first: Automated detection of dangerous operations prevents accidental data loss in production.
  • CI integration: All migration tests run automatically to catch issues before deployment.

Migration Testing Best Practices:

  • Always test rollbacks with real data scenarios
  • Measure performance impact with production-scale datasets
  • Validate foreign key constraints and data relationships
  • Use database transactions for atomic migration operations
  • Test migration safety with automated dangerous operation detection
  • Document rollback procedures for emergency situations
  • Monitor resource usage during large migrations

Metadata

Labels

  • chore: Linting, formatting, dependency hygiene, or project maintenance chores
  • cicd: Issue with CI/CD process (GitHub Actions, scaffolding)
  • devops: DevOps activities (containers, automation, deployment, makefiles, etc)
  • testing: Testing (unit, e2e, manual, automated, etc)
  • triage: Issues / Features awaiting triage
