Table of Contents
- The Project Overview
- The System Architecture
- The Inference
- Quick Start
- Tuning
- Deployment
- Package Management
- Data CI/CD Automation
- Contributing
- Troubleshooting
- Ref. Repository Structure
This project covers the development and deployment of a serverless machine learning system that recommends optimal retail prices to maximize product sales.
The system aims to help mid-sized retailers compete effectively with larger players.
The architecture establishes a scalable, serverless microservice using AWS Lambda, triggered by an API Gateway.
The prediction logic is fully containerized with Docker, and the image is stored in AWS ECR.
Trained models and features are centrally managed in S3, while ElastiCache (Redis) provides a low-latency caching layer for historical data and predictions.
This event-driven setup ensures automatic scaling and pay-per-use efficiency.
![Figure A. The system architecture (Created by Kuriko IWAI)](https://cdn.hashnode.com/res/hashnode/image/upload/v1760860401076/ae657214-fe63-4de0-a9ca-a1033bff2907.png)

Figure A. The system architecture (Created by Kuriko IWAI)
The infrastructure leverages the AWS ecosystem:
- Docker / AWS ECR as the microservice container: Packages the prediction logic and dependencies. AWS Lambda pulls the image from ECR for consistent, universal deployment.
- AWS API Gateway as the REST API endpoint: Routes external client-side UI requests (via a Flask application) to trigger the Lambda function (a minimal Flask forwarding sketch follows this list).
- AWS Lambda as inference: Executes the inference function, loading the container, models, and features to calculate price recommendations.
- AWS S3 as storage & feature store: Stores raw features, trained model artifacts, processors, and DVC metadata for ML Lineage.
- AWS ElastiCache with a Redis client as the caching layer: Stores cached analytical data and past price predictions to improve latency and resource efficiency.
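As a rough illustration of the client-side piece, the sketch below shows how a Flask route might forward a UI request to the API Gateway endpoint. The `/recommend` route, the payload field, and the `API_GATEWAY_URL` environment variable are assumptions for illustration, not the actual `app.py` implementation.

```python
# Hypothetical sketch: a Flask route forwarding UI requests to API Gateway.
# Route name, payload fields, and env var names are illustrative assumptions.
import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
API_GATEWAY_URL = os.getenv("API_GATEWAY_URL", "")  # e.g. https://<id>.execute-api.<region>.amazonaws.com/prod

@app.route("/recommend", methods=["POST"])
def recommend():
    payload = request.get_json()  # e.g. {"stockcode": "85123A"}
    resp = requests.post(f"{API_GATEWAY_URL}/recommend", json=payload, timeout=30)
    return jsonify(resp.json()), resp.status_code

if __name__ == "__main__":
    app.run(port=5002)
```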
A dedicated ML Lineage process is integrated using DVC (Data Version Control) and scheduled to run weekly by Prefect, an open-source workflow orchestrator.
- Lineage Scope (DVC): DVC tracks the entire lifecycle, including data (ETL/preprocessing), experiments (hyperparameter tuning/validation), and models/predictions (artifacts, metrics).
- Data Quality Gate: Models must pass stringent quality checks before being authorized to serve predictions:
  - Data Drift Tests: Handled by Evidently AI to identify shifts in data distribution.
  - Fairness Tests: Measure SHAP scores and other custom metrics to ensure the model operates without bias.
- Automation: Prefect triggers DVC weekly to check for updates in data or scripts and executes the full lineage process if changes are detected, ensuring continuous model freshness and quality (a minimal Prefect sketch follows this list).
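A minimal sketch of this weekly automation, not the project's actual `src/prefect_flows.py`: a Prefect flow shells out to DVC and is served on a cron schedule. The flow/deployment names and the cron expression are assumptions.

```python
# Illustrative sketch of the weekly lineage automation (not the actual src/prefect_flows.py).
# Flow/deployment names and the cron expression are assumptions.
import subprocess

from prefect import flow, task

@task
def dvc_repro() -> None:
    # Re-runs only the pipeline stages whose data or scripts changed.
    subprocess.run(["dvc", "repro"], check=True)

@task
def dvc_push() -> None:
    # Pushes updated artifacts (data, models, metrics) to the DVC remote (S3).
    subprocess.run(["dvc", "push"], check=True)

@flow(name="etl-pipeline")
def weekly_lineage() -> None:
    dvc_repro()
    dvc_push()

if __name__ == "__main__":
    # Serve the flow on a weekly schedule (Sunday midnight, as an example).
    weekly_lineage.serve(name="deploy-etl-pipeline", cron="0 0 * * 0")
```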
The infrastructure and model lifecycle are managed through a robust MLOps practice using a CI/CD pipeline integrated with GitHub.
- Code Lineage: Handled by GitHub, protected by branch rules and enforced pull request reviews.
- Source: A code commit to GitHub triggers a GitHub Actions workflow.
- Testing & Building: Automated GitHub Actions workflows run:
  - Test Phase: Runs PyTest (unit/integration tests), SAST (Static Application Security Testing), and SCA (Software Composition Analysis) for dependencies using Snyk.
  - Build Phase: If tests pass, AWS CodeBuild is triggered to build the Docker image and push it to ECR.
- Deployment: A human review phase is mandatory between build and deployment. After approval, another GitHub Actions workflow is manually triggered to deploy the updated Lambda function to staging or production.
![Figure B. The CI/CD pipeline (Created by Kuriko IWAI)](https://cdn.hashnode.com/res/hashnode/image/upload/v1760860467508/896750af-22d3-45ca-9fc1-25b96f77ab0b.png)

Figure B. The CI/CD pipeline (Created by Kuriko IWAI)
The process is designed for consistent, automated data and model management through MLOps tools. A price recommendation request flows through the system as follows:
- The client UI sends a price recommendation request via the Flask application.
- The request hits the API Gateway endpoint.
- API Gateway triggers the AWS Lambda function.
- Lambda loads the Docker container from ECR.
- The function retrieves the latest features and model artifacts from S3 and checks ElastiCache/Redis for cached data.
- The primary model performs inference on the logarithmically transformed quantity data and returns the optimal price recommendation (a minimal handler sketch follows this list).
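The sketch below walks through this request path inside the Lambda function. Bucket and key names, cache keys, the single-price input feature, and the revenue-maximizing price search are assumptions, not the project's actual handler.

```python
# Illustrative Lambda handler for the request path above (not the project's actual handler).
# Bucket/key names, cache keys, feature handling, and the price search are assumptions.
import json
import os

import boto3
import redis
import torch

s3 = boto3.client("s3")
cache = redis.Redis(host=os.getenv("REDIS_HOST", "localhost"), port=6379, ssl=True)

def handler(event, context):
    body = json.loads(event.get("body", "{}"))     # request forwarded by API Gateway
    stockcode = body["stockcode"]

    cached = cache.get(f"prediction:{stockcode}")  # serve a cached prediction if one exists
    if cached:
        return {"statusCode": 200, "body": cached.decode()}

    # Pull the latest production model artifact from S3 into Lambda's /tmp storage.
    s3.download_file(os.environ["MODEL_BUCKET"], f"production/dfn_best_{stockcode}.pth", "/tmp/model.pth")
    model = torch.load("/tmp/model.pth", map_location="cpu")  # assumes a fully pickled module
    model.eval()

    # Predict log-quantity over a grid of candidate prices and pick the revenue-maximizing one
    # (assumes a hypothetical single-feature model for brevity).
    candidate_prices = torch.linspace(1.0, 50.0, steps=100)
    with torch.no_grad():
        log_qty = model(candidate_prices.unsqueeze(1)).squeeze()
    revenue = candidate_prices * torch.expm1(log_qty)
    best_price = float(candidate_prices[int(torch.argmax(revenue))])

    response = json.dumps({"stockcode": stockcode, "recommended_price": round(best_price, 2)})
    cache.setex(f"prediction:{stockcode}", 3600, response)  # cache for one hour
    return {"statusCode": 200, "body": response}
```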
The system utilizes multiple machine learning models to ensure prediction redundancy and reliability. The primary mechanism involves predicting the quantity of a product sold at a given price point.
- Primary Model: A multi-layered feedforward network (PyTorch).
  - Role: Serves first-line predictions.
  - Tuning: Tuned via Optuna's Bayesian optimization (with a grid-search fallback).
- Backup Models: LightGBM, SVR, and Elastic Net (Scikit-learn).
  - Role: Prioritized backups used if the primary model fails, ensuring redundancy (a failover sketch follows this list).
  - Tuning: Tuned via the Scikit-Optimize framework.
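A minimal sketch of the failover logic described above; the artifact paths, serialization formats, and priority order details are illustrative assumptions.

```python
# Illustrative failover sketch: try the primary model first, then prioritized backups.
# Artifact paths and serialization formats are assumptions for illustration.
import logging

import joblib
import torch

logger = logging.getLogger(__name__)

def load_primary():
    # Primary: multi-layered feedforward network (PyTorch).
    return torch.load("models/production/dfn_best.pth", map_location="cpu")

def backup_loaders():
    # Prioritized backups, assumed to be serialized with joblib.
    for path in ["models/gbm/lgbm.joblib", "models/svr/svr.joblib", "models/en/elastic_net.joblib"]:
        yield path, (lambda p=path: joblib.load(p))

def load_model_with_failover():
    try:
        return load_primary()
    except Exception as exc:  # primary unavailable or corrupted
        logger.warning("Primary model failed: %s; falling back to backups.", exc)
    for path, loader in backup_loaders():
        try:
            return loader()
        except Exception as exc:
            logger.warning("Backup %s failed: %s", path, exc)
    raise RuntimeError("No model could be loaded.")
```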
Models are evaluated using metrics on both the transformed and the original data, where a lower value indicates better performance (a short evaluation sketch follows this list).
- For logged values: Mean Squared Error (MSE).
- For actual (original) values: Root Mean Squared Log Error (RMSLE) and Mean Absolute Error (MAE).
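The sketch below shows how these metrics might be computed with scikit-learn, assuming predictions are made in log space and inverted with `expm1` for the original-scale metrics.

```python
# Illustrative evaluation sketch: MSE on log-transformed values,
# RMSLE and MAE on the original (back-transformed) quantities.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, mean_squared_log_error

def evaluate(y_log_true: np.ndarray, y_log_pred: np.ndarray) -> dict:
    y_true = np.expm1(y_log_true)                      # back to the original quantity scale
    y_pred = np.clip(np.expm1(y_log_pred), 0, None)    # guard against negative quantities
    return {
        "mse_log": mean_squared_error(y_log_true, y_log_pred),
        "rmsle": float(np.sqrt(mean_squared_log_error(y_true, y_pred))),
        "mae": mean_absolute_error(y_true, y_pred),
    }

print(evaluate(np.log1p(np.array([10.0, 3.0, 25.0])), np.log1p(np.array([12.0, 2.0, 20.0]))))
```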
- Logarithmic Transformation (Data Preprocessing): Quantity data is log-transformed before training and prediction to achieve a denser data distribution. This is crucial for normalizing skewed data and reducing the influence of extreme values (outliers), enabling all models to learn underlying patterns more effectively.
- Model Diversity and Redundancy:
  - The system employs a hybrid approach combining a multi-layered feedforward network (deep learning) as the primary predictor with diverse traditional machine learning models (LightGBM, SVR, Elastic Net) as backups.
  - This multi-model inference strategy provides a failover mechanism, ensuring high availability by loading a prioritized backup model if the primary fails.
- Advanced Hyperparameter Optimization (a tuning sketch follows this list):
  - Bayesian optimization (Optuna) is utilized for the deep learning primary model, efficiently searching the hyperparameter space to find optimal settings (with a grid-search fallback available).
  - The backup Scikit-learn models are tuned using the Scikit-Optimize framework.
- Production Quality Gates: To keep the model reliable in a dynamic retail environment, the ML Lineage process incorporates the following quality checks:
  - Data Drift Testing (Evidently AI): Continuously identifies shifts in production data distributions that could compromise the model's generalization capabilities.
  - Fairness Testing: Validates that the model operates without unwanted bias across different features or segments before it is authorized to serve predictions.
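As a hedged sketch of the Optuna-based tuning for the primary network: the search space, network hyperparameters, and the stubbed training function below are assumptions; the real training loop lives in `src/model/`.

```python
# Illustrative Optuna sketch for the primary network's Bayesian optimization.
# The search space and the stubbed objective are assumptions.
import optuna

def train_and_validate(lr: float, hidden_units: int, dropout: float, n_layers: int) -> float:
    # Stub standing in for the real training loop; returns a fake validation
    # loss so the sketch runs end to end.
    return (lr - 1e-3) ** 2 + dropout / n_layers + 1.0 / hidden_units

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    hidden_units = trial.suggest_int("hidden_units", 32, 256)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    n_layers = trial.suggest_int("n_layers", 1, 4)
    # In the real pipeline this would train the PyTorch network and
    # return the validation MSE on log-transformed quantities.
    return train_and_validate(lr, hidden_units, dropout, n_layers)

study = optuna.create_study(direction="minimize")  # TPE sampler (Bayesian optimization) by default
study.optimize(objective, n_trials=50)
print(study.best_params)
```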
Install uv.

For macOS:
```bash
brew install uv
```
For Ubuntu/Debian:
```bash
sudo apt-get install uv
```
Set up the virtual environment and install dependencies:
```bash
uv venv
source .venv/bin/activate
uv lock --upgrade
uv sync
```
or, with pip:
```bash
pip install -r requirements.txt
```
- AssertionError/module mismatch errors: set the default Python version using pyenv:
```bash
pyenv install 3.12.8
pyenv global 3.12.8  # optional: `pyenv global system` to get back to the system default version
uv python pin 3.12.8
echo 3.12.8 >> .python-version
```
Create a `.env` file in the project root and add secret variables following the `.env.sample` file.

Run the app:
```bash
uv run app.py --cache-clear
```
The API endpoint is available at http://localhost:5002.
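Once the app is running, a request could look like the following; the `/recommend` route and payload fields are hypothetical, so check `app.py` for the actual endpoints.

```python
# Hypothetical request against the local Flask app; the route and payload
# fields are assumptions -- see app.py for the actual endpoints.
import requests

resp = requests.post(
    "http://localhost:5002/recommend",
    json={"stockcode": "85123A"},
    timeout=30,
)
print(resp.status_code, resp.json())
```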
- The data_handling folder contains data-related scripts.
- After updating scripts, run:
```bash
uv run src/data_handling/main.py
```
- The retrain script loads the serialized model from the model store, retrains it with new data, and uploads the retrained model back to the model store:
```bash
uv run src/retrain.py
```
- The main script runs feature engineering and model tuning from scratch, and updates the instances saved in the model store and feature store in S3:
```bash
uv run src/main.py
```
- Before running the script, make sure to test the new script in a notebook.
- Run the main script for a stockcode to tune the model on training data for that specific stockcode:
```bash
uv run src/main_stockcode.py {STOCKCODE} --cache-clear
```
- Build and run the Docker image:
```bash
docker build -t <APP NAME> .
docker run -p 5002:5002 -e ENV=local <APP NAME> app.py
```
Replace `<APP NAME>` with an app name of your choice.
- Push the Docker image to AWS Elastic Container Registry (ECR):
```bash
# tag the image
docker tag <YOUR ECR NAME>:<YOUR ECR VERSION> <URI>.dkr.ecr.<REGION>.amazonaws.com/<ECR NAME>:<VERSION>
# push to the ECR
docker push <URI>.dkr.ecr.<REGION>.amazonaws.com/<ECR NAME>:<VERSION>
```
- Cache storage (ElastiCache) runs on the Redis engine.
- To test the connection locally:
```bash
redis-cli --tls -h clustercfg.{REDIS_CLUSTER}.cache.amazonaws.com -p 6379 -c
```
- To flush all caches (WITH CAUTION):
```bash
redis-cli -h clustercfg.{REDIS_CLUSTER}.cache.amazonaws.com -p 6379 --tls
# once connected, flush all data
FLUSHALL
# or flush a specific database (if using multiple databases)
FLUSHDB
```
- Add a package: `uv add <package>`
- Remove a package: `uv remove <package>`
- Run a command in the virtual environment: `uv run <command>`
- To completely refresh the environment:
```bash
rm -rf .venv
rm -rf uv.lock
uv cache clean
uv venv
source .venv/bin/activate
uv sync
```
- Run the DVC pipeline and push the updated data to the remote cache:
```bash
dvc repro
# add the updated lock file
git add dvc.lock
git commit -m 'updated'
git push
# push data to the DVC remote
dvc push
```
- Force-run all stages in the DVC pipeline, including stages without any updates:
```bash
dvc repro -f
```
- Run the DVC pipeline for a specific stockcode:
```bash
dvc repro etl_pipeline_stockcode -p stockcode={STOCKCODE}
dvc repro preprocess_stockcode -p stockcode={STOCKCODE}
```
- Train the model using data from the DVC pipeline:
```bash
uv run src/main_stockcode.py {STOCKCODE}
dvc add models/production/dfn_best_{STOCKCODE}.pth
dvc push
rm models/production/dfn_best_{STOCKCODE}.pth
```
- To check the cache status explicitly:
```bash
dvc data status --not-in-remote
```
- To edit the DVC pipeline, update `dvc.yaml` and `params.yaml` for parameter updates.
- Run the Prefect server locally:
```bash
uv run prefect server start
export PREFECT_API_URL="http://127.0.0.1:4200/api"
```
- Deploy the weekly DVC pipeline run (from the Docker container):
```bash
uv run src/prefect_flows.py
```
- Test-run the Prefect worker:
```bash
# add the user $USER to the docker group
sudo dscl . -append /Groups/docker GroupMembership $USER
prefect worker start --pool <YOUR-WORKER-POOL-NAME>
```
- Create a flow run for the deployment:
```bash
prefect deployment run 'etl-pipeline/deploy-etl-pipeline'
```
Ref.
- Prefect dashboard: http://127.0.0.1:4200/dashboard
- Prefect official documentation - deploy via Python
- Create your feature branch (`git checkout -b feature/your-amazing-feature`).
- Create a feature.
- Pull the latest version of the source code from the main branch (`git pull origin main`) and address conflicts, if any.
- Commit your changes (`git add .`, then `git commit -m 'Add your-amazing-feature'`).
- Push to the branch (`git push origin feature/your-amazing-feature`).
- Open a pull request.
- Flag `#REFINEME` for any improvement needed and `#FIXME` for any errors.
Pre-commit runs the hooks defined in the `.pre-commit-config.yaml` file before every commit.
To activate the hooks:
- Install pre-commit hooks:
```bash
uv run pre-commit install
```
- Run pre-commit checks manually:
```bash
uv run pre-commit run --all-files
```
Pre-commit hooks help maintain code quality by running checks for formatting, linting, and other issues before each commit.
- To skip pre-commit hooks:
```bash
git commit --no-verify -m "your-commit-message"
```
Common issues and solutions:
- API key errors: Ensure all API keys in the `.env` file are correct and up to date. Make sure to call `load_dotenv()` at the top of the Python file to apply the latest environment values (see the snippet after this list).
- Data warehouse connection issues: Check the logs in the AWS console (CloudWatch), and confirm that `.env` and the Lambda function's environment configuration are correct.
- Memory errors: If processing large datasets, you may need to increase the available memory for the Python process.
- Python quits unexpectedly: Check this Stack Overflow article.
- `reportMissingImports` error from Pyright after installing a package: This can occur when installing new libraries while VS Code is running. Open the command palette (Ctrl + Shift + P) and run the `Python: Restart Language Server` task.
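For the API key issue above, a minimal snippet of the `load_dotenv()` pattern (the variable name is a placeholder):

```python
# Load variables from the project-root .env file before reading any secrets.
import os

from dotenv import load_dotenv

load_dotenv()  # call this at the top of the entry-point script
API_KEY = os.getenv("SOME_API_KEY")  # placeholder variable name
```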
.
└── .venv/              [.gitignore]    # stores uv venv
│
└── .github/                            # infrastructure ci/cd
│
└── .dvc/                               # dvc folder - cache, tmp, config
│
└── data/               [dvc track]     # version tracked by dvc
└── preprocessors/      [dvc track]     # version tracked by dvc
└── models/                             # stores serialized model after training and tuning
│     └──dfn/                           # deep feedforward network
│     └──gbm/                           # light gbm
│     └──en/                            # elastic net
│     └──production/    [dvc track]     # models to be stored in S3 for production use
└── reports/            [dvc track]     # reports on data drift, shap values
└── metrics/            [dvc track]     # model evaluation metrics (mae, mse, rmsle)
|
└── notebooks/                          # stores experimentation notebooks
│
└── src/                                # core functions
│     └──_utils/                        # utility functions
│     └──data_handling/                 # functions to engineer features
│     └──model/                         # functions to train, tune, validate models
│     │     └── sklearn_model
│     │     └── torch_model
│     │     └── ...
│     └──main.py                        # main script to perform inference locally (without dvc repro)
│
└── app.py                              # flask application (API endpoints)
│
└── tests/                              # pytest scripts and config
└── pytest.ini
│
└── pyproject.toml                      # project config
│
└── .env                [.gitignore]    # environment variables
│
└── uv.lock                             # dependency locking
│
└── .python-version                     # python version locking (3.12)
│
└── Dockerfile.lambda.local             # docker config
└── Dockerfile.lambda.production
└── .dockerignore
└── requirements.txt
│
└── dvc.yaml                            # dvc pipeline config
└── params.yaml
└── .dvcignore
└── dvc.lock
│
└── .pre-commit-config.yaml             # pre-commit check config
└── .synk                               # Snyk (dependency and code scanning) config
All images and contents, unless otherwise noted, are by the author.