Finetune LLMs on your laptop’s GPU—no code, no PhD, no hassle.
- GPU-Powered Finetuning: Optimized for NVIDIA GPUs (even 4GB VRAM).
- One-Click Workflow: Upload data → Pick task → Train → Test.
- Hardware-Aware: Auto-detects your GPU/CPU and recommends models.
- React UI: No CLI or notebooks—just a friendly interface.
- Text Generation: Generates free-form text from the model's pretrained and fine-tuned knowledge. Ideal for use cases like customer support chatbots, story generators, social media script writers, code generators, and general-purpose chatbots.
- Summarization: Condenses long articles and documents into short summaries. Ideal for use cases like news article, legal document, and medical article summarization.
- Extractive Question Answering: Extracts the answer to a query from a given context. Best for use cases like Retrieval-Augmented Generation (RAG) and enterprise document search (for example, finding information in internal documentation).
- Python 3.11.x: Ensure Python 3.11 is installed.
- NVIDIA GPU: 6 GB or more of VRAM recommended.
- CUDA: Ensure CUDA is installed and configured for your GPU.
- Hugging Face account: Create an account on Hugging Face and generate a fine-grained access token.
- Install the package:

  ```bash
  pip install modelforge-finetuning
  ```
- Set your Hugging Face API key as an environment variable:

  Linux:

  ```bash
  export HUGGINGFACE_TOKEN=your_huggingface_token
  ```

  Windows PowerShell:

  ```powershell
  $env:HUGGINGFACE_TOKEN="your_huggingface_token"
  ```

  Windows CMD:

  ```cmd
  set HUGGINGFACE_TOKEN=your_huggingface_token
  ```

  Or use a .env file:

  ```bash
  echo "HUGGINGFACE_TOKEN=your_huggingface_token" > .env
  ```
- Install the PyTorch build that matches your CUDA version:
  - Navigate to the PyTorch installation page and select the appropriate CUDA version for your system.
  - Install PyTorch with that CUDA version. For example, for CUDA 12.6 on Windows:

    ```bash
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
    ```
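  After installing, it is worth verifying that PyTorch actually sees your GPU and that its VRAM meets the 6 GB guideline:

  ```python
  import torch

  # True only if the installed PyTorch build matches a working CUDA setup.
  print(torch.cuda.is_available())

  # Report the detected GPU and its total VRAM.
  props = torch.cuda.get_device_properties(0)
  print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB VRAM")
  ```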
- Run the application:

  ```bash
  modelforge run
  ```

- Done! Navigate to http://localhost:8000 in your browser and get started!
- Start the application:

  ```bash
  modelforge run
  ```

- Navigate to the app: Open your browser and go to http://localhost:8000.
- To stop the application and free up resources, press Ctrl+C in the terminal running the app.
{"input": "Enter a really long article here...", "output": "Short summary."},
{"input": "Enter the poem topic here...", "output": "Roses are red..."}
ModelForge uses a modular configuration system for model recommendations. Contributors can easily add new recommended models by adding configuration files to the `model_configs/` directory. Each hardware profile (`low_end`, `mid_range`, `high_end`) has its own configuration file where you can specify primary and alternative models for different tasks.
See the Model Configuration Guide for detailed instructions on how to add new model recommendations.
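For a feel of the shape such a profile might take, here is a purely illustrative sketch; the actual file format, field names, and model choices are defined by the Model Configuration Guide, and the model IDs below are placeholders:

```python
# Hypothetical sketch of a mid_range hardware-profile config.
# Structure and field names are invented for illustration only.
MID_RANGE = {
    "text_generation": {
        "primary": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # placeholder model ID
        "alternatives": ["microsoft/phi-2"],
    },
    "summarization": {
        "primary": "facebook/bart-large-cnn",
        "alternatives": ["google/flan-t5-base"],
    },
    "extractive_question_answering": {
        "primary": "deepset/roberta-base-squad2",
        "alternatives": ["distilbert-base-cased-distilled-squad"],
    },
}
```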
ModelForge is built with:

- transformers + peft (LoRA finetuning)
- bitsandbytes (4-bit quantization)
- FastAPI + Python (backend)
- React.js (frontend UI)
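To make the stack concrete, here is a minimal sketch of how transformers, peft, and bitsandbytes combine to finetune a 4-bit quantized model with LoRA. This is not ModelForge's actual internals; the model ID and hyperparameters are placeholders:

```python
# Illustrative sketch only; not ModelForge's internal code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder causal LM

# bitsandbytes: load the base model quantized to 4-bit (NF4) to fit small GPUs.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# peft: attach a LoRA adapter; only these small low-rank matrices are trained.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections in Llama-style models
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints the tiny trainable fraction
```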