A streamlined chat application that leverages Docker Model Runner to serve Large Language Models (LLMs) through a modern Streamlit interface. This project demonstrates containerized LLM deployment with a user-friendly web interface.
- 🤖 Modern chat interface with message history
- ⚡️ Real-time LLM responses
- 🔄 Clear chat functionality
- 🛡️ Error handling and loading states
- 🐳 Containerized deployment
- 🏥 Health monitoring
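The message-history and clear-chat behavior can be sketched in plain Python. In the real app this state is assumed to live in Streamlit's `st.session_state`; here it is a plain list so the logic shows in isolation (function names are illustrative):

```python
# Minimal sketch of the chat-history logic behind the interface.
# The real app would keep this state in Streamlit's st.session_state.

def append_message(history, role, content):
    """Add one message in the OpenAI chat format."""
    history.append({"role": role, "content": content})
    return history

def clear_chat(history):
    """Reset the conversation (the 'clear chat' feature)."""
    history.clear()
    return history

history = []
append_message(history, "user", "Hello!")
append_message(history, "assistant", "Hi! How can I help?")
clear_chat(history)
print(len(history))  # 0
```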
- Frontend: Streamlit
- Backend: Docker Model Runner
- API: OpenAI-compatible endpoints
- Containerization: Docker & Docker Compose
- Language: Python 3.11
- Key Dependencies: OpenAI SDK, python-dotenv
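Because Docker Model Runner exposes OpenAI-compatible endpoints, the request body the frontend sends to `/chat/completions` follows the standard chat format. A sketch of such a payload (the system prompt and user message are illustrative; the model name matches the default below):

```python
import json

# Illustrative request body for the OpenAI-compatible
# /chat/completions endpoint served by Docker Model Runner.
payload = {
    "model": "ai/smollm2:latest",  # default model from backend.env
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}

body = json.dumps(payload)
print(json.loads(body)["model"])  # ai/smollm2:latest
```

The OpenAI SDK builds and sends this body for you once its client is pointed at the `BASE_URL` configured in `backend.env`.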
```
DMR/
├── app/
│   ├── Dockerfile          # Frontend container configuration
│   ├── main.py             # Streamlit chat interface
│   └── requirements.txt    # Python dependencies
├── docker-compose.yml      # Service orchestration
├── backend.env             # Environment configuration
└── README.md               # Project documentation
```
- Docker Desktop
- Git
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd DMR
  ```
- Configure the environment: create `backend.env` with:

  ```env
  BASE_URL=http://host.docker.internal:12434/engines/llama.cpp/v1/
  API_KEY=your_api_key
  MODEL=your_model_name
  LLM_MODEL_NAME=ai/smollm2:latest
  ```
- Build and run:

  ```bash
  docker compose up --build
  ```
- Access the application:
  - Open http://localhost:8501
  - Start chatting!
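The `backend.env` values are loaded at startup via python-dotenv. The hand-rolled parser below only illustrates the `KEY=VALUE` format it reads; it writes a throwaway temp file rather than touching the real `backend.env`:

```python
import os
import tempfile

# Tiny stand-in for python-dotenv's load_dotenv(): parse KEY=VALUE lines.
def load_env_file(path):
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    return values

# Demo with a throwaway file mirroring two backend.env keys.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("BASE_URL=http://host.docker.internal:12434/engines/llama.cpp/v1/\n")
    fh.write("MODEL=your_model_name\n")
    path = fh.name

env = load_env_file(path)
print(env["MODEL"])  # your_model_name
os.unlink(path)
```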
- Built from Python 3.11-slim
- Exposes port 8501
- Includes health checking
- Dependencies managed via requirements.txt
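The health checking can be as simple as probing the Streamlit HTTP port. The helper below is a sketch of that idea, not the container's actual `HEALTHCHECK` command, which may differ:

```python
import urllib.error
import urllib.request

def is_healthy(url, timeout=2.0):
    """Return True if the service answers with an HTTP status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except (urllib.error.URLError, OSError):
        return False

# A port with nothing listening reports unhealthy.
print(is_healthy("http://127.0.0.1:9", timeout=0.5))  # False
```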
- Uses Docker Model Runner
- Default model: ai/smollm2:latest
- Configurable via environment variables
- Isolated network for services
- Bridge driver for container communication
- Internal service discovery
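In `docker-compose.yml` this typically takes the form of a user-defined bridge network that the services join; containers on such a network resolve each other by service name. A sketch, with illustrative names (the actual file may differ):

```yaml
# Sketch of the networking portion of docker-compose.yml (names are illustrative).
services:
  app:
    build: ./app
    ports:
      - "8501:8501"
    env_file:
      - backend.env
    networks:
      - dmr-net

networks:
  dmr-net:
    driver: bridge  # bridge driver for container-to-container communication
```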
- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On macOS/Linux
  ```
- Install dependencies:

  ```bash
  pip install -r app/requirements.txt
  ```
- Run locally:

  ```bash
  streamlit run app/main.py
  ```
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request