Epic: Configuration Database for Dynamic Settings Management
Field | Value |
---|---|
Title | Database-Backed Configuration System with Hot-Reload Support |
Goal | Replace static .env configuration with a database-backed system that allows runtime configuration changes without restarts, while maintaining backwards compatibility and security |
Why now | Production deployments need dynamic configuration changes without container restarts. Multi-tenant environments require different settings per instance. Configuration drift across nodes causes operational issues. |
Depends on | Database migrations (Alembic), ConfigService layer, Admin API endpoints |
Depends on: #286
Type of Feature
- Enhancement to existing functionality
- Developer tooling or test improvement
- Packaging, automation and deployment
User Story 1: Initial Configuration Population
As a: System Administrator
I want: The gateway to automatically populate database configuration from .env on first start
So that: Existing deployments work without manual migration steps
Acceptance Criteria
Scenario: First startup with existing .env file
Given a fresh database with no configuration table
And an existing .env file with configuration values
When the gateway starts for the first time
Then all .env values are imported into the configuration table
And the configuration table is marked as initialized
And the gateway uses database values for all subsequent requests
Scenario: Startup with existing configuration database
Given a populated configuration table
And an .env file with different values
When the gateway starts
Then database values take precedence over .env values
And no .env values overwrite database entries
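The two scenarios above can be sketched as a seed-once routine. This is a minimal illustration, not the gateway's implementation: it uses SQLite from the standard library, treats "table is empty" as the not-yet-initialized marker, and the names `parse_env` and `seed_if_empty` are hypothetical.

```python
import sqlite3


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def seed_if_empty(conn: sqlite3.Connection, env_text: str) -> int:
    """Import .env values only when the configuration table has no rows.

    Returns the number of rows inserted (0 when already populated).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS configuration ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL, source TEXT NOT NULL)"
    )
    (count,) = conn.execute("SELECT COUNT(*) FROM configuration").fetchone()
    if count:
        # Database values take precedence; .env never overwrites them.
        return 0
    rows = [(k, v, "env") for k, v in parse_env(env_text).items()]
    conn.executemany("INSERT INTO configuration VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Calling `seed_if_empty` a second time with a different .env leaves the stored values untouched, which matches the second scenario.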
User Story 2: Runtime Configuration Updates
As a: Platform Operator
I want: To modify configuration values through the Admin UI without restarting services
So that: I can tune performance and features in real-time based on load
Acceptance Criteria
Scenario: Update configuration via Admin UI
Given I am authenticated as an admin
When I update a configuration value in the UI
Then the value is persisted to the database
And all gateway instances reload the configuration within 30 seconds
And an audit log entry is created with the change details
Scenario: Update configuration via API
Given I have admin API credentials
When I PUT to /admin/config/{key} with a new value
Then the API returns 200 with the updated configuration
And the change is propagated to all instances
And metrics show the configuration reload event
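One way to satisfy "persisted to the database" and "an audit log entry is created" together is to write both rows in a single transaction. The sketch below assumes SQLite and a trimmed-down version of the two tables; `update_config` and `SCHEMA` are hypothetical names, and propagation to other instances is out of scope here.

```python
import sqlite3
import time

# Trimmed schema for illustration; the real tables carry more columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS configuration (
    key TEXT PRIMARY KEY, value TEXT NOT NULL, updated_at REAL NOT NULL);
CREATE TABLE IF NOT EXISTS config_audit (
    id INTEGER PRIMARY KEY AUTOINCREMENT, key TEXT NOT NULL,
    old_value TEXT, new_value TEXT, user_id TEXT NOT NULL,
    timestamp REAL NOT NULL);
"""


def update_config(conn: sqlite3.Connection, key: str, new_value: str, user_id: str):
    """Store a new value and its audit record atomically.

    Returns the previous value, or None if the key was not set before.
    """
    now = time.time()
    with conn:  # commits on success, rolls back if any statement raises
        row = conn.execute(
            "SELECT value FROM configuration WHERE key = ?", (key,)
        ).fetchone()
        old_value = row[0] if row else None
        conn.execute(
            "INSERT OR REPLACE INTO configuration (key, value, updated_at) "
            "VALUES (?, ?, ?)",
            (key, new_value, now),
        )
        conn.execute(
            "INSERT INTO config_audit (key, old_value, new_value, user_id, timestamp) "
            "VALUES (?, ?, ?, ?, ?)",
            (key, old_value, new_value, user_id, now),
        )
    return old_value
```

Because both writes share one transaction, a failed audit insert also rolls back the value change, so the audit trail cannot silently fall behind.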
User Story 3: Configuration Validation & Security
As a: Security Auditor
I want: Sensitive configurations to remain immutable via UI/API
So that: Critical security settings cannot be accidentally modified
Acceptance Criteria
Scenario: Attempt to modify locked configuration
Given a configuration key marked as "locked" (e.g., JWT_SECRET_KEY)
When I attempt to update it via UI or API
Then the request is rejected with 403 Forbidden
And an audit log entry shows the attempted modification
And the original value remains unchanged
Scenario: Configuration type validation
Given a configuration with type constraints
When I submit an invalid value type
Then the request is rejected with 422 Validation Error
And the error message explains the expected type/format
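The rejection rules in these scenarios could look roughly like the check below. It is a stdlib-only sketch: plain `isinstance` checks stand in for the Pydantic schemas named in the implementation tasks. `ConfigSpec`, `SPECS`, and `check_update` are hypothetical names, and the 404 for unknown keys is an assumption the scenarios do not cover.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class ConfigSpec:
    value_type: type
    locked: bool = False


# Illustrative registry; the real list of keys and types is not defined here.
SPECS = {
    "JWT_SECRET_KEY": ConfigSpec(str, locked=True),
    "CACHE_TTL": ConfigSpec(int),
    "UI_ENABLED": ConfigSpec(bool),
}


def check_update(key: str, value: Any) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for a proposed update."""
    spec = SPECS.get(key)
    if spec is None:
        return 404, f"unknown configuration key: {key}"  # assumption
    if spec.locked:
        return 403, f"{key} is locked and cannot be changed via UI or API"
    # bool is a subclass of int, so reject True/False for int settings.
    wrong_bool = spec.value_type is int and isinstance(value, bool)
    if wrong_bool or not isinstance(value, spec.value_type):
        expected = spec.value_type.__name__
        return 422, f"{key} expects {expected}, got {type(value).__name__}"
    return 200, "ok"
```

The locked check runs before type validation so that a locked key is rejected with 403 regardless of the submitted value.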
User Story 4: Configuration Export & Import
As a: DevOps Engineer
I want: To export and import configuration sets
So that: I can replicate environments and maintain configuration as code
Acceptance Criteria
Scenario: Export current configuration
Given a configured gateway instance
When I request GET /admin/config/export
Then I receive a JSON file with all non-sensitive configurations
And the export includes metadata (version, timestamp, source)
Scenario: Import configuration set
Given a configuration export file
When I POST to /admin/config/import with the file
Then non-locked values are updated in the database
And a dry-run option shows what would change
And conflicts are reported before applying changes
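A possible shape for export and dry-run import is sketched below, with configuration held in a plain dict. It assumes the locked keys double as the "sensitive" set. The JSON layout, the `LOCKED` set, and the function names are illustrative, not a defined format.

```python
import json
from datetime import datetime, timezone

# Assumption: locked keys are also the ones excluded from exports.
LOCKED = {"JWT_SECRET_KEY", "DATABASE_URL"}


def export_config(current: dict) -> str:
    """Serialize non-sensitive values plus metadata to JSON."""
    payload = {
        "metadata": {
            "version": 1,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "source": "gateway",
        },
        "values": {k: v for k, v in sorted(current.items()) if k not in LOCKED},
    }
    return json.dumps(payload, indent=2)


def apply_import(current: dict, exported: str, dry_run: bool = True) -> dict:
    """Compute the change plan; mutate `current` only when dry_run is False."""
    incoming = json.loads(exported)["values"]
    plan = {"changes": {}, "skipped_locked": []}
    for key, new in incoming.items():
        if key in LOCKED:
            plan["skipped_locked"].append(key)
        elif current.get(key) != new:
            plan["changes"][key] = {"old": current.get(key), "new": new}
    if not dry_run:
        for key, change in plan["changes"].items():
            current[key] = change["new"]
    return plan
```

Defaulting `dry_run` to `True` means a caller has to opt in to applying changes, and the returned plan is the same structure in both modes.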
Design Sketch
flowchart TD
START[Gateway Start] --> CHECK{Config DB exists?}
CHECK -->|No| MIGRATE[Run Migration]
MIGRATE --> SEED[Seed from .env]
CHECK -->|Yes| LOAD[Load from DB]
SEED --> LOAD
LOAD --> CACHE[In-Memory Cache]
CACHE --> APP[Application Runtime]
UI[Admin UI] -->|GET/PUT| API[Config API]
CLI[mcpgateway config] -->|REST| API
API --> SERVICE[ConfigService]
SERVICE --> DB[(configuration table)]
DB -->|PostgreSQL NOTIFY| RELOAD[Config Watcher]
DB -->|Redis PubSub| RELOAD
DB -->|SQLite Polling| RELOAD
RELOAD -->|Update| CACHE
CACHE -->|Read| APP
SERVICE --> AUDIT[(config_audit table)]
classDef storage fill:#e1f5fe
classDef service fill:#fff3e0
class DB,AUDIT storage
class SERVICE,API service
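The "SQLite Polling" branch of the sketch, together with the in-memory cache, could be approximated as follows. This is an illustrative outline only: `PollingConfigCache`, its `loader` callback, and the injectable `clock` are hypothetical. A PostgreSQL NOTIFY or Redis Pub/Sub watcher would presumably call `refresh()` on each notification rather than waiting for the interval.

```python
import time
from typing import Callable, Mapping, Optional


class PollingConfigCache:
    """Serve reads from memory and re-read the store every `interval` seconds."""

    def __init__(
        self,
        loader: Callable[[], Mapping[str, str]],
        interval: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._interval = interval
        self._clock = clock
        self._snapshot = dict(loader())
        self._loaded_at = clock()

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        if self._clock() - self._loaded_at >= self._interval:
            self.refresh()
        return self._snapshot.get(key, default)

    def refresh(self) -> None:
        try:
            fresh = dict(self._loader())
        except Exception:
            # A failed reload must not crash the service: keep the last
            # good snapshot and try again after the next interval.
            self._loaded_at = self._clock()
            return
        self._snapshot = fresh
        self._loaded_at = self._clock()
```

Reads never touch the database between polls, so lookup cost is a dict access. The trade-off is that a change can take up to `interval` seconds to become visible.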
Implementation Tasks
- Database Schema
  - Create `configuration` table (key, value, type, source, locked, updated_at, created_at)
  - Create `config_audit` table (id, key, old_value, new_value, user_id, timestamp, source)
  - Add indexes for efficient key lookups and audit queries
  - Create Alembic migration scripts
- Configuration Service
  - Implement `ConfigService` with get/set/delete/list operations
  - Add type validation using Pydantic schemas
  - Implement configuration precedence (DB > ENV > defaults)
  - Add configuration change notifications (DB-specific)
- Hot-Reload Mechanism
  - PostgreSQL: LISTEN/NOTIFY implementation
  - Redis: Pub/Sub channel for config changes
  - SQLite: Polling mechanism with configurable interval
  - Update in-memory cache on notification
- Admin API Endpoints
  - `GET /admin/config` - List all configurations
  - `GET /admin/config/{key}` - Get specific configuration
  - `PUT /admin/config/{key}` - Update configuration
  - `DELETE /admin/config/{key}` - Reset to default
  - `GET /admin/config/export` - Export configuration
  - `POST /admin/config/import` - Import configuration
  - `GET /admin/config/audit` - View audit trail
- Admin UI Components
  - Configuration list view with search/filter
  - Inline edit with validation feedback
  - JSON/YAML editor for complex values
  - Audit log viewer with diff display
  - Import/Export interface
- CLI Commands
  - `mcpgateway config list` - Show all configurations
  - `mcpgateway config get KEY` - Get specific value
  - `mcpgateway config set KEY VALUE` - Update value
  - `mcpgateway config export` - Export to file
  - `mcpgateway config import FILE` - Import from file
- Security & Validation
  - Mark sensitive keys as "locked" (JWT_SECRET_KEY, database passwords)
  - Implement role-based access (only platform_admin can modify)
  - Add rate limiting for configuration updates
  - Validate value types and ranges
  - Mask sensitive values in UI/API responses
- Testing
  - Unit tests for ConfigService operations
  - Integration tests for hot-reload across instances
  - Test configuration precedence rules
  - Test audit logging functionality
  - Performance tests for configuration lookups
- Documentation
  - Configuration management guide
  - List of all configurable parameters
  - Migration guide from .env to database
  - Troubleshooting configuration issues
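The precedence rule from the Configuration Service task (DB > ENV > defaults) amounts to a layered lookup. A minimal sketch, assuming each layer is a simple mapping; `resolve` and the values in `DEFAULTS` are hypothetical:

```python
import os
from typing import Mapping, Optional

# Placeholder defaults for illustration only.
DEFAULTS = {"LOG_LEVEL": "INFO", "CACHE_TTL": "300"}


def resolve(
    key: str,
    db_values: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
    defaults: Mapping[str, str] = DEFAULTS,
) -> Optional[str]:
    """Return the first value found, checking DB, then ENV, then defaults."""
    env = os.environ if env is None else env
    for layer in (db_values, env, defaults):
        if key in layer:
            return layer[key]
    return None
```

For example, `resolve("LOG_LEVEL", {"LOG_LEVEL": "DEBUG"}, {"LOG_LEVEL": "WARNING"})` returns the database value, and removing the key from the first mapping falls through to the environment.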
Configuration Categories
Category | Examples | Reload Type | Locked |
---|---|---|---|
Security | JWT_SECRET_KEY, AUTH_REQUIRED | Restart Required | Yes |
Database | DATABASE_URL, REDIS_URL | Restart Required | Yes |
Performance | MAX_WORKERS, CACHE_TTL | Hot-Reload | No |
Features | UI_ENABLED, FEDERATION_ENABLED | Hot-Reload | No |
Limits | MAX_TOOL_TIMEOUT, REQUEST_SIZE_LIMIT | Hot-Reload | No |
Logging | LOG_LEVEL, LOG_FORMAT | Hot-Reload | No |
Alternatives Considered
Option | Pros | Cons | Decision |
---|---|---|---|
Keep .env only | Simple, standard | Requires restarts, no audit trail | Rejected |
Use external config service (Consul/etcd) | Distributed, battle-tested | Additional dependency, complexity | Rejected |
File-based with file watching | Simple to implement | Not suitable for distributed deployments | Rejected |
Database-backed with hot-reload | Audit trail, no restarts, works distributed | More complex implementation | Chosen |
Additional Context
- Backwards Compatibility: System must work with existing .env files during transition period
- Performance: Configuration lookups must not impact request latency (<1ms overhead)
- Reliability: Failed configuration updates must not crash the service
- Observability: Add metrics for configuration reload events and failures
- Rollback: Support reverting to previous configuration values via audit log
Success Metrics:
- Zero-downtime configuration changes
- 90% reduction in configuration-related restarts
- Complete audit trail for compliance
- <30 second propagation time across all instances
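The Rollback item above could be served directly from `config_audit`: look up the most recent change for a key and restore its `old_value`. The sketch below assumes SQLite and a reduced schema. `rollback_last_change` and `ROLLBACK_SCHEMA` are hypothetical names, and recording the rollback as its own audit row is a design assumption.

```python
import sqlite3

# Reduced schema for illustration; the real tables carry more columns.
ROLLBACK_SCHEMA = """
CREATE TABLE configuration (key TEXT PRIMARY KEY, value TEXT NOT NULL);
CREATE TABLE config_audit (
    id INTEGER PRIMARY KEY AUTOINCREMENT, key TEXT NOT NULL,
    old_value TEXT, new_value TEXT, user_id TEXT NOT NULL,
    source TEXT NOT NULL);
"""


def rollback_last_change(conn: sqlite3.Connection, key: str, user_id: str):
    """Restore the value recorded before the most recent change to `key`.

    Returns the restored value, or None when there is nothing to revert to.
    """
    row = conn.execute(
        "SELECT old_value, new_value FROM config_audit "
        "WHERE key = ? ORDER BY id DESC LIMIT 1",
        (key,),
    ).fetchone()
    if row is None or row[0] is None:
        return None
    old_value, current_value = row
    with conn:  # value change and audit row commit together
        conn.execute(
            "UPDATE configuration SET value = ? WHERE key = ?", (old_value, key)
        )
        conn.execute(
            "INSERT INTO config_audit (key, old_value, new_value, user_id, source) "
            "VALUES (?, ?, ?, ?, 'rollback')",
            (key, current_value, old_value, user_id),
        )
    return old_value
```

Since the rollback itself is audited, calling the function twice in a row undoes the rollback. A production version would likely accept a specific audit entry id instead of always taking the latest one.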