[deepseek v3.2] Update deepgemm version (#117)

heheda12345 · web-flow · commit cf5bdeedc1da · 2025-11-06T09:43:11.000-08:00
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
diff --git a/DeepSeek/DeepSeek-V3_2-Exp.md b/DeepSeek/DeepSeek-V3_2-Exp.md
@@ -7,7 +7,7 @@
 
 ```bash
 uv pip install vllm --extra-index-url https://wheels.vllm.ai/nightly
-uv pip install https://wheels.vllm.ai/dsv32/deep_gemm-2.1.0%2B594953a-cp312-cp312-linux_x86_64.whl
+uv pip install git+https://github.com/deepseek-ai/DeepGEMM.git@v2.1.1.post3 --no-build-isolation # Other versions may also work. We recommend using the latest released version from https://github.com/deepseek-ai/DeepGEMM/releases
 ```
 
 Note: DeepGEMM is used in two places: MoE and MQA logits computation. It is required for MQA logits computation. To disable it for the MoE part only, set the environment variable `VLLM_USE_DEEP_GEMM=0`. Some users have reported better performance with `VLLM_USE_DEEP_GEMM=0`, e.g. on H20 GPUs. Disabling DeepGEMM may also help if you want to skip the long warmup.
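
For reference, a minimal sketch of the setting described in the note, assuming the variable is exported in the same shell that later launches vLLM. The launch command itself is not shown here because it depends on your deployment.

```bash
# Sketch: turn off DeepGEMM for the MoE path only.
# Per the note, DeepGEMM is still needed for MQA logits computation,
# so the package must remain installed.
export VLLM_USE_DEEP_GEMM=0

# Confirm the variable is exported, so that a vLLM process started
# from this shell will inherit it.
env | grep '^VLLM_USE_DEEP_GEMM='
```

Unsetting the variable (`unset VLLM_USE_DEEP_GEMM`) returns to the default behaviour, which makes it easy to compare both settings on your own hardware.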