Description
For the release, we intend to publish a technical blog post on granular profiling of the distributed llm-d system under cache-squeeze loads, covering:
- Scheduling behavior: scoring and indexing
- vLLM KV-Cache behavior: admissions and evictions

The post will also show how precise-prefix-cache awareness optimizes request placement.