By moving from AI-blind routing to a precise, KV-cache aware strategy, **we can unlock order-of-magnitude improvements in latency and throughput on the exact same hardware**. The well-lit path of precise prefix-cache awareness offers a tested, benchmarked solution to make your distributed deployments dramatically more efficient.
:::tip Choosing the Right Strategy
The optimal scheduler depends on the complexity of the workload. Below is a hierarchy of supported strategies, where each level addresses the limitations of the one before it. A minimal sketch contrasting these levels follows this tip.

* **1. Random/Round-Robin Scheduling**

  This simplest approach works well for symmetric workloads where all requests have similar computational costs and minimal cache reuse.
  * **Limitation:** It creates load imbalance when workloads are asymmetric.

* **2. Load-Aware Scheduling**

  The necessary next step for asymmetric workloads. By routing requests based on serving capacity, it prevents overload and improves resource utilization.
  * **Limitation:** It cannot exploit caching opportunities, resulting in redundant computation.

* **3. Approximate Prefix-Cache Scheduling**

  This strategy introduces cache-awareness for workloads with predictable prefix reuse. It is effective when its estimates of the cache state are reliable.
  * **Limitation:** These estimates can become stale at high scale or with dynamic workloads, leading to suboptimal routing.

* **4. Precise Prefix-Cache Aware Scheduling**

  In production environments with tight SLOs, this is the most effective strategy for dynamic, high-scale workloads where maximizing the cache-hit ratio is a primary performance driver.

:::
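
To make the hierarchy concrete, here is a minimal, self-contained Python sketch of the four levels as interchangeable routing functions. Everything in it is illustrative: the pod names, `BLOCK_SIZE`, the `loads` and `cache_index` inputs, and the scoring weights are hypothetical stand-ins for this post, not llm-d's actual scheduler API.

```python
import hashlib
import itertools
from collections import defaultdict

# Hypothetical inputs for the sketch: `loads` maps pod -> in-flight requests,
# `cache_index` maps pod -> set of KV-block hashes the pod reports holding.
PODS = ["pod-a", "pod-b", "pod-c"]
BLOCK_SIZE = 16  # tokens per KV-cache block (assumed for this sketch)

_rr = itertools.cycle(PODS)
_history = defaultdict(set)  # blocks this router previously sent to each pod


def block_hashes(tokens):
    """Hash the prompt in fixed-size blocks, chaining each block's hash onto
    the preceding ones, mirroring how prefix caches key KV blocks."""
    h, hashes = hashlib.sha256(), []
    for i in range(0, len(tokens) - len(tokens) % BLOCK_SIZE, BLOCK_SIZE):
        h.update(repr(tokens[i:i + BLOCK_SIZE]).encode())
        hashes.append(h.hexdigest())
    return hashes


def round_robin(tokens, loads, cache_index):
    # Level 1: blind rotation; fine when every request costs about the same.
    return next(_rr)


def load_aware(tokens, loads, cache_index):
    # Level 2: route to the pod with the most spare serving capacity.
    return min(PODS, key=lambda p: loads[p])


def approx_prefix_aware(tokens, loads, cache_index):
    # Level 3: estimate cache hits from what this router *previously sent*
    # to each pod; the estimate goes stale once pods evict blocks.
    hashes = block_hashes(tokens)
    best = max(PODS, key=lambda p: (sum(b in _history[p] for b in hashes),
                                    -loads[p]))
    _history[best].update(hashes)
    return best


def precise_prefix_aware(tokens, loads, cache_index):
    # Level 4: score against the cache contents the pods actually report,
    # blended with load so a single hot pod does not saturate.
    hashes = block_hashes(tokens)

    def score(p):
        hits = sum(b in cache_index[p] for b in hashes)
        return 2.0 * hits - loads[p]  # weights are arbitrary in this sketch

    return max(PODS, key=score)
```

The contrast that matters is between the last two functions: the approximate scorer trusts its own bookkeeping, while the precise scorer consults state reported by the serving pods themselves, which is exactly the staleness gap the tip above describes.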