
Commit cfe1e8b

Merge pull request #449 from dwithchenna/rai-1.5.1-release
Updates for the RAI 1.5.1 release

2 parents 87e7ab2 + aee1457

3 files changed (+10 −3 lines)

docs/inst.rst

Lines changed: 4 additions & 0 deletions

@@ -75,6 +75,10 @@ Install Ryzen AI Software

 The Ryzen AI Software packages are now installed in the conda environment created by the installer.

+.. note::
+
+   The latest updates, including LLM performance improvements, are available in the RAI 1.5.1 release. Download it from the link :download:`ryzen-ai-1.5.1.msi <https://account.amd.com/en/forms/downloads/ryzen-ai-software-platform-xef.html?filename=ryzen-ai-1.5.1.msi>`.
+
 .. _quicktest:

docs/modelrun.rst

Lines changed: 1 addition & 1 deletion

@@ -388,7 +388,7 @@ In the example above, the cache directory is set to the absolute path of the fol

 ONNX Runtime EP Context Cache
 =============================

-The Vitis AI EP supports the ONNX Runtime EP context cache feature. This features allows dumping and reloading a snapshot of the EP context before deployment. Currently, this feature is only available for INT8 models.
+The Vitis AI EP supports the ONNX Runtime EP context cache feature. This feature allows dumping and reloading a snapshot of the EP context before deployment.

 The user can enable dumping of the EP context by setting the ``ep.context_enable`` session option to 1.
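The ``ep.context_enable`` session option above can be sketched in Python. The key names ``ep.context_enable`` and ``ep.context_file_path`` come from the ONNX Runtime EP-context documentation; the model and output file names below are placeholders, and the commented session-creation code assumes ``onnxruntime`` with the Vitis AI EP is installed:

```python
# Sketch: session-config entries for dumping a snapshot of the EP context.
ep_context_entries = {
    "ep.context_enable": "1",                   # turn on EP context dumping
    "ep.context_file_path": "model_ctx.onnx",   # illustrative output path
}

# With onnxruntime installed, the entries would be applied like this:
#   import onnxruntime as ort
#   so = ort.SessionOptions()
#   for key, value in ep_context_entries.items():
#       so.add_session_config_entry(key, value)
#   session = ort.InferenceSession("model.onnx", sess_options=so,
#                                  providers=["VitisAIExecutionProvider"])
print(ep_context_entries["ep.context_enable"])
```

On a later run, the dumped context model can be loaded directly instead of recompiling, which is the deployment-time benefit the paragraph above describes.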

docs/oga_model_prepare.rst

Lines changed: 5 additions & 2 deletions

@@ -116,7 +116,10 @@ Generate the final model for NPU execution mode:

 Known Issue: In the current version, Mistral-7B-Instruct-v0.1 has a known issue during OGA model conversion in the postprocessing stage.

-New in 1.5.1:
+New in 1.5.1
+============

 In Release 1.5.1, a new option was added to generate a prefill-fused version of the Hybrid Model. It is currently tested for `Phi-3.5-mini-instruct`, `Llama-2-7b-chat-hf` and `Llama-3.1-8B-Instruct`.

@@ -137,11 +140,11 @@ After the model is generated, locate the ``genai_config.json`` file inside the m

 3. Set ``dd_cache`` to ``<output_dir>\\.cache``, for example ``"dd_cache": "C:\\Users\\user\\<generated model folder>\\.cache"``
 4. For the ``Phi-3.5-mini-instruct`` and ``Llama-2-7b-chat-hf`` models:
+
    - Set ``"hybrid_opt_disable_npu_ops": "1"`` inside ``"amd_options"``.
    - Set ``"fusion_opt_io_bind_kv_cache": "1"`` inside ``"amd_options"``.
    - Set ``"flattened_kv": true`` inside ``"search"``.
-
 ..
 ------------
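Taken together, steps 3 and 4 imply ``genai_config.json`` fragments along these lines. This is a hand-written sketch assembled only from the keys named in the steps above; the exact nesting of ``"amd_options"`` within the file, and the placement of ``"dd_cache"`` inside it, are assumptions here, so follow the layout of the generated file and leave all other keys untouched:

```json
{
  "amd_options": {
    "dd_cache": "C:\\Users\\user\\<generated model folder>\\.cache",
    "hybrid_opt_disable_npu_ops": "1",
    "fusion_opt_io_bind_kv_cache": "1"
  },
  "search": {
    "flattened_kv": true
  }
}
```

Note that the ``"amd_options"`` values are strings (``"1"``) while ``"flattened_kv"`` is a JSON boolean, matching the quoting shown in the steps.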
