Commit 0539927

minor reword
1 parent 933e525 commit 0539927

1 file changed: +1 -1 lines changed

blog/2025-09-24_kvcache-wins-you-can-see.md

Lines changed: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ While Retrieval-Augmented Generation also relies on large prefixes (system promp

 ## **The Challenge of Scale-Out**

-What happens when we move from single-instance to distributed production clusters? The once-unified KV-cache becomes **disaggregated**. Each vLLM pod manages its own cache in isolation. Standard load balancers spread traffic evenly using cache-blind metrics, scattering related requests across different pods and destroying cache locality.
+What happens when we move from single-instance environment to distributed production clusters? The once-unified KV-cache becomes **disaggregated**. Each vLLM pod manages its own cache in complete isolation. Standard load balancers naively spread traffic evenly using cache-blind metrics, scattering related requests across different pods and destroying cache locality.

 Let's revisit our agentic workflow example to see the direct impact of being blind to this unmanaged, disaggregated cache:

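The cache-locality claim in the reworded paragraph can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions, not vLLM's scheduler or any real load balancer: each pod's cache is modeled as an unbounded set of whole-prefix strings (real KV caches are block-based and evict entries), and the names `simulate`, `round_robin`, and `prefix_affinity` are hypothetical.

```python
import hashlib


def simulate(requests, num_pods, route):
    """Count prefix-cache hits when `route` picks a pod for each request."""
    caches = [set() for _ in range(num_pods)]  # one isolated cache per pod
    hits = 0
    for i, prefix in enumerate(requests):
        pod = route(i, prefix, num_pods)
        if prefix in caches[pod]:
            hits += 1  # this pod has already seen the prefix
        else:
            caches[pod].add(prefix)  # cold miss: the pod computes and stores it
    return hits


def round_robin(i, prefix, num_pods):
    """Cache-blind policy: spread requests evenly, ignoring their content."""
    return i % num_pods


def prefix_affinity(i, prefix, num_pods):
    """Cache-aware policy (assumed for illustration): hash the prefix to a pod."""
    digest = hashlib.sha256(prefix.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_pods


# Three sessions, four requests each, interleaved: s0, s1, s2, s0, s1, s2, ...
requests = [f"session-{s}" for _ in range(4) for s in range(3)]

print("round-robin hits:", simulate(requests, 4, round_robin))
print("prefix-affinity hits:", simulate(requests, 4, prefix_affinity))
```

In this toy setup with four pods, round-robin routing sends every request of a session to a pod that has not seen its prefix, while prefix-based routing misses only on the first request of each session.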
