
Commit cfe1e8b

Merge pull request #449 from dwithchenna/rai-1.5.1-release
Updates for the RAI 1.5.1 release

2 parents 87e7ab2 + aee1457

3 files changed (+10 −3 lines)

docs/inst.rst

Lines changed: 4 additions & 0 deletions

@@ -75,6 +75,10 @@ Install Ryzen AI Software

 The Ryzen AI Software packages are now installed in the conda environment created by the installer.

+.. note::
+
+   The latest updates, including LLM performance improvements, are available in the RAI 1.5.1 release. Download it from the link :download:`ryzen-ai-1.5.1.msi <https://account.amd.com/en/forms/downloads/ryzen-ai-software-platform-xef.html?filename=ryzen-ai-1.5.1.msi>`.
+
 .. _quicktest:

docs/modelrun.rst

Lines changed: 1 addition & 1 deletion

@@ -388,7 +388,7 @@ In the example above, the cache directory is set to the absolute path of the fol

 ONNX Runtime EP Context Cache
 =============================

-The Vitis AI EP supports the ONNX Runtime EP context cache feature. This features allows dumping and reloading a snapshot of the EP context before deployment. Currently, this feature is only available for INT8 models.
+The Vitis AI EP supports the ONNX Runtime EP context cache feature. This feature allows dumping and reloading a snapshot of the EP context before deployment.

 The user can enable dumping of the EP context by setting the ``ep.context_enable`` session option to 1.
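The ``ep.context_enable`` session option above can be sketched in Python. The key names ``ep.context_enable`` and ``ep.context_file_path`` come from the ONNX Runtime EP-context documentation; the model and output file names below are placeholders, and the commented session-creation code assumes ``onnxruntime`` with the Vitis AI EP is installed:

```python
# Sketch: session-config entries for dumping a snapshot of the EP context.
ep_context_entries = {
    "ep.context_enable": "1",                   # turn on EP context dumping
    "ep.context_file_path": "model_ctx.onnx",   # illustrative output path
}

# With onnxruntime installed, the entries would be applied like this:
#   import onnxruntime as ort
#   so = ort.SessionOptions()
#   for key, value in ep_context_entries.items():
#       so.add_session_config_entry(key, value)
#   session = ort.InferenceSession("model.onnx", sess_options=so,
#                                  providers=["VitisAIExecutionProvider"])
print(ep_context_entries["ep.context_enable"])
```

On a later run, the dumped context model can be loaded directly instead of recompiling, which is the deployment-time benefit the paragraph above describes.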

docs/oga_model_prepare.rst

Lines changed: 5 additions & 2 deletions

@@ -116,7 +116,10 @@ Generate the final model for NPU execution mode:

 Known Issue: In the current version, Mistral-7B-Instruct-v0.1 has a known issue during OGA model conversion in the postprocessing stage.

-New in 1.5.1:
+New in 1.5.1
+============

 In Release 1.5.1, a new option was added to generate a prefill-fused version of the Hybrid Model. It is currently tested for `Phi-3.5-mini-instruct`, `Llama-2-7b-chat-hf` and `Llama-3.1-8B-Instruct`.

@@ -137,11 +140,11 @@ After the model is generated, locate the ``genai_config.json`` file inside the m

 3. Set ``dd_cache`` to ``<output_dir>\\.cache``, for example ``"dd_cache": "C:\\Users\\user\\<generated model folder>\\.cache"``
 4. For the ``Phi-3.5-mini-instruct`` and ``Llama-2-7b-chat-hf`` models:
+
    - Set ``"hybrid_opt_disable_npu_ops": "1"`` inside ``"amd_options"``.
    - Set ``"fusion_opt_io_bind_kv_cache": "1"`` inside ``"amd_options"``.
    - Set ``"flattened_kv": true`` inside ``"search"``.
-
 ..
 ------------
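Taken together, steps 3 and 4 imply ``genai_config.json`` fragments along these lines. This is a hand-written sketch assembled only from the keys named in the steps above; the exact nesting of ``"amd_options"`` within the file, and the placement of ``"dd_cache"`` inside it, are assumptions here, so follow the layout of the generated file and leave all other keys untouched:

```json
{
  "amd_options": {
    "dd_cache": "C:\\Users\\user\\<generated model folder>\\.cache",
    "hybrid_opt_disable_npu_ops": "1",
    "fusion_opt_io_bind_kv_cache": "1"
  },
  "search": {
    "flattened_kv": true
  }
}
```

Note that the ``"amd_options"`` values are strings (``"1"``) while ``"flattened_kv"`` is a JSON boolean, matching the quoting shown in the steps.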
