
Conversation

Collaborator

@njhames njhames commented Aug 12, 2025

MLflow 3.1.0 Models-from-Code Migration for Fine-Tuning with ORPO Blueprint

Overview

Migrates the Fine-Tuning with ORPO blueprint from MLflow's legacy serialization-based model logging (python_model) to the models-from-code approach (loader_module + data_path). The refactoring resolves the MLflow 3.1.0 serialization errors while keeping the prediction API unchanged and separating business logic from MLflow integration.

🎯 LATEST UPDATE: Completed universal structure standardization with full adoption of the new generic layout (src/mlflow/) using standardized class names Model and Logger. All legacy blueprint-specific prefixes have been eliminated, and the file structure now follows the universal AI blueprints standard.

Evidence

🤖 Fine Tuning with Orpo.pdf

Summary of Changes

  • Primary Purpose: Eliminate MLflow 3.1.0 serialization errors and modernize deployment architecture
  • Technical Approach: Clean separation of concerns through models-from-code pattern with standalone model classes
  • Structure Standardization: COMPLETED - Full universal layout adoption (src/mlflow/) with generic class names (Model, Logger)
  • Scope: Complete architectural migration affecting model loading, logging, and deployment workflows

Technical Changes

Universal Structure Standardization ✅ COMPLETED

File Structure Migration

BEFORE (Legacy Structure):
core/fine_tuning_service/
├── fine_tuning_model.py      # Class: FineTuningModel
├── fine_tuning_loader.py     # Loader for MLflow
├── fine_tuning_service.py    # Class: FineTuningService
└── __init__.py

AFTER (Universal Structure) - FULLY IMPLEMENTED:
src/mlflow/
├── __init__.py               # Exports: Model, Logger (dynamic imports)
├── model.py                  # Class: Model (formerly FineTuningModel)
├── loader.py                 # MLflow models-from-code loader (formerly fine_tuning_loader.py)
└── logger.py                 # Class: Logger (formerly FineTuningService)

Generic Class Names - APPLIED

  • Model (formerly FineTuningModel): Framework-agnostic fine-tuning comparison logic layer
  • Logger (formerly FineTuningService): MLflow registration and artifact management layer
  • Universal imports: from src.mlflow import Model, Logger
  • Standardized loader module: loader_module="src.mlflow.loader"
  • Updated notebooks: All imports changed from core.fine_tuning_service to src.mlflow

Implementation Changes

  • ✅ All three core files renamed: fine_tuning_model.py → model.py, fine_tuning_loader.py → loader.py, fine_tuning_service.py → logger.py
  • ✅ Class definitions updated: FineTuningModel → Model, FineTuningService → Logger
  • ✅ Import statements refactored throughout codebase
  • ✅ Loader module reference updated in log_model() call
  • ✅ Notebooks updated with new import structure: from src.mlflow import Logger
  • ✅ Package __init__.py rewritten with dynamic exports and lazy loading
  • ✅ Legacy core/fine_tuning_service/ directory removed

New Architecture Components

src/mlflow/loader.py (Universal Structure)

  • Purpose: MLflow models-from-code entry point implementing the required _load_pyfunc() function
  • Functionality:
    • Loads configuration, base model, and fine-tuned model artifacts from MLflow
    • Handles proper artifact directory structure validation
    • Returns initialized Model instance for prediction
  • Integration: Called automatically by MLflow during model loading and deployment
  • Import Reference: from src.mlflow.model import Model

src/mlflow/model.py (Universal Structure)

  • Purpose: Standalone business logic layer with zero MLflow dependencies
  • Architecture: Framework-agnostic model class designed for testability and maintainability
  • Functionality:
    • Complete fine-tuning comparison pipeline: dynamic model loading, prediction interface
    • Supports all model sources: local directories, HuggingFace model IDs
    • Maintains identical predict(model_input, params) API signature for backward compatibility
    • Handles adaptive LLM comparison with memory management and efficient model switching
  • Design Pattern: Clean separation between business logic and MLflow integration concerns
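
To illustrate the idea of a model class with no MLflow dependency, here is a rough sketch that keeps the predict(model_input, params) signature and holds only one model variant in memory at a time. The loader factories, the `use_finetuning` parameter key, and the list-of-strings return format are assumptions made for this example, not the blueprint's actual interface.

```python
import gc
from typing import Any, Callable, Dict, List, Optional

Generator = Callable[[str], str]


class Model:
    """Illustrative comparison model that holds at most one generator in memory."""

    def __init__(self, loaders: Dict[str, Callable[[], Generator]]):
        # loaders maps a variant name ("base", "finetuned") to a zero-argument factory.
        self._loaders = loaders
        self._active_name: Optional[str] = None
        self._active: Optional[Generator] = None

    def _switch_to(self, name: str) -> Generator:
        if name not in self._loaders:
            raise KeyError(f"Unknown model variant: {name}")
        if self._active_name != name:
            # Release the previous variant before loading the next one.
            self._active = None
            gc.collect()
            self._active = self._loaders[name]()
            self._active_name = name
        return self._active

    def predict(self, model_input: List[str], params: Optional[Dict[str, Any]] = None) -> List[str]:
        params = params or {}
        name = "finetuned" if params.get("use_finetuning", True) else "base"
        generate = self._switch_to(name)
        return [generate(prompt) for prompt in model_input]
```

Because the class takes plain callables, it can be exercised in unit tests with fake generators and without MLflow installed.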

Refactored Service Architecture

src/mlflow/logger.py (Universal Structure)

  • Role Transformation: Pure MLflow registration layer following canonical pattern
  • Architectural Changes:
    • Eliminated MLflow inheritance dependencies (PythonModel removed)
    • Streamlined log_model() method using loader_module approach exclusively
    • Implemented elegant artifact organization with proper temporary directory management
    • Updated to universal loader module reference: loader_module="src.mlflow.loader"
  • Artifact Management:
    /artifacts/data/
      ├── config.yaml               # Model configuration
      ├── model_no_finetuning/      # Base model directory
      ├── finetuned_model/          # Fine-tuned model directory
      └── demo/                     # UI components (optional)
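
The staging step can be sketched roughly as follows, assuming the layout listed above. `stage_artifacts` and `log_with_loader_module` are hypothetical helper names, the staging directory name and the `name="model"` value are placeholders, and `name` is the MLflow 3 spelling of the argument that older releases call `artifact_path`.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def stage_artifacts(config_file: Path, base_model_dir: Path, finetuned_model_dir: Path,
                    staging_root: Path, demo_dir: Optional[Path] = None) -> Path:
    """Copy artifacts into staging_root using the layout listed above."""
    staging_root.mkdir(parents=True, exist_ok=True)
    shutil.copy2(config_file, staging_root / "config.yaml")
    shutil.copytree(base_model_dir, staging_root / "model_no_finetuning")
    shutil.copytree(finetuned_model_dir, staging_root / "finetuned_model")
    # The demo directory is optional and is skipped when absent.
    if demo_dir is not None and demo_dir.is_dir():
        shutil.copytree(demo_dir, staging_root / "demo")
    return staging_root


def log_with_loader_module(config_file: Path, base_model_dir: Path,
                           finetuned_model_dir: Path, demo_dir: Optional[Path] = None):
    """Stage artifacts in a temporary directory and log them via loader_module."""
    import mlflow  # imported lazily so stage_artifacts works without MLflow installed

    with tempfile.TemporaryDirectory() as tmp:
        # Directory name is an assumption made for this sketch.
        data_dir = stage_artifacts(config_file, base_model_dir, finetuned_model_dir,
                                   Path(tmp) / "model_artifacts", demo_dir)
        # log_model copies the staged files before the temporary directory is removed.
        return mlflow.pyfunc.log_model(
            name="model",  # placeholder value
            loader_module="src.mlflow.loader",
            data_path=str(data_dir),
            code_paths=["../src"],
        )
```

Using a context-managed temporary directory means the staged copy is cleaned up even if logging raises.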
    

Package Structure Enhancement

  • src/mlflow/__init__.py: Universal module initialization with dynamic imports
    __all__ = ["Model", "Logger"]
    
    def __getattr__(name):
        if name == "Model":
            from .model import Model
            return Model
        if name == "Logger":
            from .logger import Logger
            return Logger
        raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
  • Notebooks Updated: All imports changed from from core.fine_tuning_service import FineTuningService to from src.mlflow import Logger
  • Clean API: Model and Logger are imported lazily on first attribute access, so importing src.mlflow does not load either submodule until it is needed
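
The effect of the module-level `__getattr__` can be checked with a small experiment. The sketch below writes the same `__init__.py` into a throwaway package (the name `lazy_demo_pkg` and the empty class bodies are placeholders) and records whether the submodules are present in `sys.modules` before and after the first attribute access.

```python
import importlib
import sys
import tempfile
from pathlib import Path

INIT_SOURCE = '''
__all__ = ["Model", "Logger"]

def __getattr__(name):
    if name == "Model":
        from .model import Model
        return Model
    if name == "Logger":
        from .logger import Logger
        return Logger
    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
'''


def demo_lazy_import() -> dict:
    """Build a throwaway package and record when its submodules get imported."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "lazy_demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text(INIT_SOURCE)
        (pkg / "model.py").write_text("class Model:\n    pass\n")
        (pkg / "logger.py").write_text("class Logger:\n    pass\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("lazy_demo_pkg")
            before = "lazy_demo_pkg.model" in sys.modules
            _ = module.Model  # first access triggers the submodule import
            after = "lazy_demo_pkg.model" in sys.modules
            logger_loaded = "lazy_demo_pkg.logger" in sys.modules
        finally:
            # Undo the changes to the interpreter state made by this demo.
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.startswith("lazy_demo_pkg")]:
                del sys.modules[name]
    return {
        "model_before_access": before,
        "model_after_access": after,
        "logger_loaded": logger_loaded,
    }
```

In the real package this matters because a notebook that only needs Logger does not have to import the model module and whatever heavy dependencies it pulls in.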

Configuration & Environment Changes

Configuration Updates

  • Notebooks: Updated model path resolution to use configuration-driven approach
  • Path Resolution: Enhanced model path handling for both local directories and HuggingFace model IDs
  • code_paths parameter: Updated from ["../core", "../src"] to ["../src"] only ✅

Utility Function Integration

  • Path Resolution: Enhanced integration with src.utils helper functions:
    • get_project_root(): Project root path resolution
    • get_models_dir(): Models directory resolution
    • get_fine_tuned_models_dir(): Fine-tuned models directory resolution
  • Environment Integration: Improved support for container environments with proper path handling
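
As an illustration of configuration-driven path resolution, the sketch below classifies a configured model path as a local directory or a HuggingFace model ID. The helper bodies are guesses: the real src.utils functions may resolve paths differently, and the `PROJECT_ROOT` environment variable and `models` directory name are assumptions made for this example.

```python
import os
from pathlib import Path
from typing import Tuple


def get_project_root() -> Path:
    """Hypothetical: project root from PROJECT_ROOT, else the current directory."""
    return Path(os.environ.get("PROJECT_ROOT", Path.cwd())).resolve()


def get_models_dir() -> Path:
    """Hypothetical: models live in a 'models' directory under the project root."""
    return get_project_root() / "models"


def resolve_model_source(model_path: str) -> Tuple[str, str]:
    """Classify a configured model path as a local directory or a Hub model ID."""
    candidate = Path(model_path).expanduser()
    if not candidate.is_absolute():
        candidate = get_models_dir() / candidate
    if candidate.is_dir():
        return "local", str(candidate)
    # Anything that is not an existing directory is passed through unchanged.
    return "huggingface", model_path
```

Checking the filesystem first means a local copy takes precedence when a directory and a Hub ID happen to share a name.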

Implementation Details

Architecture Impact

  • Design Pattern: Clean layered architecture with separation of concerns
    • Registration Layer: Logger (MLflow integration only)
    • Business Logic Layer: Model (framework-agnostic core functionality)
    • Loader Layer: loader (MLflow deployment interface)
  • Integration Points: Maintained identical external API for seamless migration
  • Performance Considerations: Eliminated serialization overhead, improved model loading efficiency
  • Universal Structure: Standardized naming eliminates blueprint-specific complexity ✅

Code Organization

  • File Structure Changes:
    • Added: src/mlflow/ package with universal structure (model.py, loader.py, logger.py) ✅
    • Added: Standalone model class with comprehensive fine-tuning comparison functionality
    • Removed: Legacy core/fine_tuning_service/ blueprint-specific structure ✅
    • Modified: Notebooks updated for universal imports and generic class names ✅
  • Module Interactions: Clean imports with explicit dependency management and generic class names
  • Data Flow: Streamlined artifact handling through temp directory organization

Error Resolution Strategy

  • Serialization Issues: Eliminated cloudpickle dependency conflicts with fine-tuning frameworks
  • MLflow Compatibility: Full MLflow 3.1.0 support through models-from-code pattern
  • Path Resolution Issues: Robust path handling for both local and HuggingFace model sources
  • Error Handling: Comprehensive exception handling with detailed logging throughout initialization
  • Fallback Mechanisms: Graceful degradation for missing optional components (demo assets, model files)
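
One way to express the required-versus-optional distinction is sketched below. Which artifacts count as required is an assumption made for this example (config and both model directories required, demo assets optional), and `check_artifacts` is a hypothetical helper name.

```python
import logging
from pathlib import Path
from typing import List

logger = logging.getLogger(__name__)

# Assumed split between required and optional artifacts.
REQUIRED = ("config.yaml", "model_no_finetuning", "finetuned_model")
OPTIONAL = ("demo",)


def check_artifacts(data_path: str) -> List[str]:
    """Raise if a required artifact is missing; return the optional ones found."""
    root = Path(data_path)
    missing = [name for name in REQUIRED if not (root / name).exists()]
    if missing:
        raise FileNotFoundError(
            f"Missing required artifacts under {root}: {', '.join(missing)}"
        )

    available = []
    for name in OPTIONAL:
        if (root / name).exists():
            available.append(name)
        else:
            # Optional components only produce a warning.
            logger.warning("Optional artifact %r not found under %s; continuing without it", name, root)
    return available
```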

Testing Strategy

Manual Testing

  • Test Scenarios:
    • Model registration with various fine-tuning configurations using Logger.register_model()
    • Model loading and prediction across different model sources
    • Deployment validation through notebook execution with new import structure ✅
    • Adaptive model comparison with base vs fine-tuned model switching

Quality Assurance

Code Quality

  • Code Style: Consistent with repository standards, comprehensive docstrings, proper type hints
  • Documentation: Clear architectural layer responsibilities, detailed function documentation
  • Error Handling: Robust exception management with informative error messages and logging
  • Universal Structure: Standardized layout improves maintainability across blueprints ✅

Performance Impact

  • Model Loading: Faster initialization due to eliminated serialization overhead
  • Memory Usage: Reduced memory footprint by removing unnecessary inheritance
  • Deployment: Improved deployment reliability with the models-from-code approach

Validation Results ✅

  • ✅ No legacy prefixed filenames (_model.py, _loader.py, _service.py) remain in src/
  • ✅ No prefixed class names (FineTuningModel, FineTuningService) found in src/mlflow/
  • ✅ No references to core.fine_tuning_service or core/fine_tuning_service in codebase
  • ✅ All imports updated to from src.mlflow import Model, Logger
  • ✅ Loader module correctly references src.mlflow.loader
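
Checks like these can be automated with a small scan. The following is a sketch rather than the script used for this PR: `find_legacy_references` is a hypothetical helper that looks for the legacy filename suffixes and identifiers named above in .py and .ipynb files under a given root.

```python
import re
from pathlib import Path
from typing import List

LEGACY_FILE_SUFFIXES = ("_model.py", "_loader.py", "_service.py")
LEGACY_PATTERN = re.compile(r"FineTuningModel|FineTuningService|core[./]fine_tuning_service")
SCANNED_EXTENSIONS = {".py", ".ipynb"}


def find_legacy_references(root: Path) -> List[str]:
    """Return one finding per legacy filename or legacy identifier under root."""
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SCANNED_EXTENSIONS:
            continue
        if path.name.endswith(LEGACY_FILE_SUFFIXES):
            findings.append(f"{path}: legacy filename")
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if LEGACY_PATTERN.search(line):
                findings.append(f"{path}:{lineno}: {line.strip()}")
    return findings
```

An empty list from `find_legacy_references(Path("src"))` would correspond to the first three checks passing for that directory.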

Review Guidelines

Critical Review Areas

  1. MLflow Integration: Verify loader.py correctly implements models-from-code pattern with universal imports ✅
  2. API Compatibility: Confirm Model.predict() maintains identical signature and behavior ✅
  3. Artifact Handling: Validate proper organization and cleanup of temporary directories ✅
  4. Configuration Management: Review model path resolution and environment variable handling ✅
  5. Error Scenarios: Test behavior with missing or invalid artifacts/configurations
  6. Universal Structure: Confirm generic class names and imports work correctly ✅

Testing Instructions

  1. Register Model: Run notebooks/register-model.ipynb to validate new logging approach with universal structure
  2. Load and Test: Verify model loads correctly in MLflow UI and responds to API calls
  3. Deploy Validation: Confirm adaptive model comparison functionality with base/fine-tuned switching
  4. Migration Comparison: Compare before/after behavior for identical input scenarios
  5. Notebook Validation: Execute notebooks to verify new import structure: from src.mlflow import Logger
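
For step 4, a before/after comparison could be scripted along these lines. `compare_predictions` is a hypothetical helper, and exact equality is only meaningful if generation is deterministic (for example greedy decoding or a fixed seed), which is an assumption about how the models are configured.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

PredictFn = Callable[[Any, Optional[Dict[str, Any]]], Any]
Case = Tuple[Any, Optional[Dict[str, Any]]]


def compare_predictions(old_predict: PredictFn, new_predict: PredictFn,
                        cases: List[Case]) -> List[Tuple[int, Any, Any]]:
    """Return (case index, old output, new output) for every case where outputs differ."""
    mismatches = []
    for index, (model_input, params) in enumerate(cases):
        old_out = old_predict(model_input, params)
        new_out = new_predict(model_input, params)
        if old_out != new_out:
            mismatches.append((index, old_out, new_out))
    return mismatches
```

Passing the pre-migration and post-migration `predict` methods with the same list of (model_input, params) cases yields an empty list when behavior is identical.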

Deployment Considerations

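  • Rollback Procedure: Models logged with the previous python_model approach are incompatible with the models-from-code loader, so a rollback requires reverting this change and re-registering the model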
  • Environment Setup: Ensure MLflow 3.1.0 compatibility in target deployment environments
  • Dependencies: Verify fine-tuning framework compatibility (transformers, peft, torch)
  • Universal Structure: New standardized layout (src/mlflow/) provides consistency across all blueprints ✅

Commit History Summary

The development progression demonstrates systematic architectural migration:

  1. Initial Implementation: Created foundational models-from-code structure
  2. Business Logic Extraction: Developed standalone Model with full fine-tuning comparison functionality
  3. Service Layer Refactoring: Simplified Logger to pure registration responsibilities
  4. Universal Structure Migration: COMPLETED - Applied standardized layout with generic class names and universal imports ✅
  5. Final Refinement: Code cleanup, validation, and canonical pattern alignment

Breaking Changes

None - This migration maintains complete API compatibility:

  • ✅ Identical model signature and parameter schema
  • ✅ Notebook interfaces unchanged apart from the import path, which now follows the universal structure: from src.mlflow import Logger
  • ✅ Same prediction API behavior and response formats
  • ✅ Compatible demo and UI components
  • ✅ Universal structure provides consistency across blueprints

Future Considerations

Technical Debt Resolution

  • Testing Coverage: Expand automated test suite for edge cases and error scenarios
  • Documentation: Update architecture diagrams and deployment guides

Blueprint Migration Template

This implementation provides a reusable migration pattern for other AI blueprints:

  1. Create src/mlflow/ structure with loader.py, model.py, logger.py
  2. Develop Model class without MLflow inheritance in model.py
  3. Update Logger.log_model() to use loader_module="src.mlflow.loader"
  4. Rename classes to generic names: Model, Logger (no blueprint prefixes) ✅
  5. Update all imports from core.{service} to src.mlflow
  6. Add comprehensive error handling and logging throughout ✅
  7. Maintain API compatibility through identical method signatures ✅

Migration Status: COMPLETE with Universal Structure - Ready for Final Review

This comprehensive architectural migration successfully modernizes the Fine-Tuning with ORPO blueprint for MLflow 3.1.0 and fully implements the universal structure standard (src/mlflow/ with generic class names Model and Logger) that provides consistency, maintainability, and ease of understanding across all AI blueprints.

• Extract business logic from LLMComparisonModel to FineTuningModel
• Create FineTuningLoader for MLflow models-from-code integration
• Transform FineTuningService to registration-only service
• Update register-model notebook to use new FineTuningService
• Add model_path configuration to config.yaml
• Add get_model_path function to src/utils.py
• Update requirements.txt with MLflow 3.1.0
• Remove legacy BaseGenerativeService architecture
• Maintain backward compatibility with legacy register_llm_comparison_model

This migration resolves MLflow 3.1.0 serialization issues while preserving
all existing functionality through clean architectural separation.
@github-actions github-actions bot added enhancement Improvements to existing features dependencies Pull requests that update a dependency file python Pull requests that update python code labels Aug 12, 2025
…at/mlflow-models-from-code-migration-fine-tuning-with-orpo
@NickyJhames NickyJhames changed the title Feat/mlflow models from code migration for fine-tunning with orpo bp feat: Standardize fine-tuning-with-orpo blueprint to universal src/mlflow structure Sep 2, 2025
@NickyJhames NickyJhames changed the title feat: Standardize fine-tuning-with-orpo blueprint to universal src/mlflow structure feat: MLflow 3.1.0 Models-from-Code Migration for Fine-Tuning with ORPO Blueprint Sep 2, 2025
@ata-turhan ata-turhan self-requested a review September 4, 2025 20:02
Collaborator

@ataturhan-hp ataturhan-hp left a comment

Notebook testing is successful, but the deployment fails:

run-workflow.ipynb
register-model.ipynb

This is the error we see on logs:

######################## Model Service Initialization ########################
ℹ️ Starting Phoenix Model Service container setup...

######################## Configuration Loading ########################
ℹ️ Artifact path: models:/m-e5cfecbdb155408ab37b3d2950f2556f
ℹ️ Model registry URI detected, extracting model ID: m-e5cfecbdb155408ab37b3d2950f2556f
ℹ️ Using artifact path: /phoenix/mlflow/840339209521334209/models/m-e5cfecbdb155408ab37b3d2950f2556f
❌ Configuration file not found. Expected: /phoenix/mlflow/840339209521334209/models/m-e5cfecbdb155408ab37b3d2950f2556f/artifacts/data/model_artifacts/config.yaml

@ata-turhan ata-turhan force-pushed the feat/mlflow-models-from-code-migration-fine-tuning-with-orpo branch from 0df94bc to 2d58fa8 Compare September 25, 2025 15:25
Nicky Souza and others added 7 commits October 14, 2025 15:21
…at/mlflow-models-from-code-migration-fine-tuning-with-orpo
- Move core/fine_tuning_service/* → src/mlflow/*
- Rename fine_tuning_*.py → *.py (generic names)
- Update class names: FineTuningModel → Model, FineTuningService → Logger
- Update imports and loader_module references to src.mlflow
- Update notebooks: from core.fine_tuning_service → from src.mlflow
- Maintain full API compatibility
…-orpo' of https://github.com/HPInc/AI-Blueprints into feat/mlflow-models-from-code-migration-fine-tuning-with-orpo
@NickyJhames NickyJhames self-assigned this Oct 14, 2025
gabriela-ponciano and others added 2 commits October 14, 2025 19:58
@njhames njhames changed the title feat: MLflow 3.1.0 Models-from-Code Migration for Fine-Tuning with ORPO Blueprint feat: [GEN-AI] MLflow 3.1.0 Models-from-Code Migration for Fine-Tuning with ORPO Blueprint Nov 3, 2025
@njhames njhames requested a review from ataturhan-hp November 3, 2025 17:16
@njhames njhames marked this pull request as ready for review November 3, 2025 17:16
@njhames njhames changed the base branch from main to v2.0.0 November 3, 2025 17:48
@njhames njhames merged commit c1ccf8b into v2.0.0 Nov 3, 2025
2 of 3 checks passed
@njhames njhames deleted the feat/mlflow-models-from-code-migration-fine-tuning-with-orpo branch November 3, 2025 17:48

Labels

dependencies Pull requests that update a dependency file enhancement Improvements to existing features python Pull requests that update python code

5 participants