@peterchen-intel peterchen-intel commented Oct 25, 2025

Details:

  • XAttention for FP16 KVCache as a preview feature
  • to add unit tests
  • to disable XAttention on legacy platforms (the XAttention kernels are implemented for Xe2/Xe3 with CM)
  • to streamline how XAttention is selected. Currently the KVCache shape is used to determine it; there may be a better approach.
  • to add warning messages for unsupported cases: multiple subsequences, a mistyped KVCache precision, etc.
  • to remove the trivial converter nodes between the xattention_threshold Parameter and the PagedAttention input
  • to refactor the XAttention kernel implementations to reuse RT parameters instead of recomputing them
  • to enable the U8 KVCache path (stretch goal)
  • WWB with long prompts

This PR should work along with openvinotoolkit/openvino.genai#2764.

Tickets:

Reviews and approvals are in #32064

The content is the same as PR #32064 (14e57f9); the diff between the two PR heads is empty:
$ git fetch origin pull/32064/head:pr/32064
$ git fetch origin pull/32551/head:pr/32551
$ git diff pr/32551 pr/32064
(empty)
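The same check can be scripted by comparing tree objects rather than reading the diff output. This is a minimal sketch, not part of the PR: `same_tree` is a hypothetical helper name, and it assumes both PR heads have already been fetched into local branches as in the commands above.

```shell
#!/bin/sh
# same_tree REF_A REF_B (hypothetical helper, not part of the PR):
# succeeds when both refs point at identical file trees, even if the
# commits themselves (author, signature, history) differ.
# Returns 2 when either ref cannot be resolved.
same_tree() {
    tree_a=$(git rev-parse --verify --quiet "$1^{tree}") || return 2
    tree_b=$(git rev-parse --verify --quiet "$2^{tree}") || return 2
    [ "$tree_a" = "$tree_b" ]
}

# Example, assuming the branches fetched above exist locally:
#   same_tree pr/32551 pr/32064 && echo "identical content"
```

Comparing `^{tree}` hashes gives a yes/no answer through the exit status, which is convenient in scripts; `git diff --quiet pr/32551 pr/32064` would be an equivalent alternative.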

This PR was opened to resolve the commit check issue in PR #32064: one commit (5201cdf) is signed by an unknown author.
https://docs.github.com/en/github/authenticating-to-github/managing-commit-signature-verification/about-commit-signature-verification
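To spot such commits locally before pushing, git's `%G?` format placeholder reports a one-letter signature status per commit. The sketch below is an illustration only: `list_unverified` is a hypothetical helper name, and the local GPG result may not match the criteria GitHub applies for its "Verified" badge.

```shell
#!/bin/sh
# list_unverified RANGE (hypothetical helper, not part of the PR):
# prints "<short-hash> <status> <author>" for every commit in RANGE whose
# signature status is not "G" (good).
# Status letters from git's %G? placeholder include:
#   N = no signature, B = bad signature,
#   E = signature cannot be checked (e.g. missing key),
#   U = good signature with unknown validity.
list_unverified() {
    git log --format='%h %G? %an' "$1" | awk '$2 != "G"'
}

# Example, assuming the PR head was fetched as pr/32064:
#   list_unverified origin/master..pr/32064
```

An empty result means every commit in the range carries a signature that the local keyring considers good.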

riverlijunjie and others added 30 commits October 25, 2025 22:54
   1. kvcache update's k/v offset issue
   2. 2nd token lse data overflow issue
@peterchen-intel peterchen-intel requested review from a team as code owners October 25, 2025 16:37
@peterchen-intel peterchen-intel requested review from CuriousPanCake and removed request for a team October 25, 2025 16:37
@github-actions github-actions bot added the category: GPU (OpenVINO GPU plugin) and category: transformations (OpenVINO Runtime library - Transformations) labels Oct 25, 2025
@peterchen-intel peterchen-intel removed the request for review from CuriousPanCake October 26, 2025 01:50
@peterchen-intel peterchen-intel added this pull request to the merge queue Oct 26, 2025
@peterchen-intel

Reviews and approvals are in #32064

Merged via the queue into openvinotoolkit:master with commit 6d052b6 Oct 26, 2025
235 of 237 checks passed
@peterchen-intel peterchen-intel deleted the xattn/merge branch October 26, 2025 10:37
github-merge-queue bot pushed a commit to openvinotoolkit/openvino.genai that referenced this pull request Oct 28, 2025
- block size for XAttention
- work with
-- openvinotoolkit/openvino#32064
-- openvinotoolkit/openvino#32551

Tickets: CVS-173845

Related PRs
- #2924
- #2927

---------

Co-authored-by: cecilia peng <[email protected]>
Co-authored-by: Chen Peter <[email protected]>
Co-authored-by: wgzintel <[email protected]>
Co-authored-by: Copilot <[email protected]>

Labels

category: GPU (OpenVINO GPU plugin), category: transformations (OpenVINO Runtime library - Transformations), Code Freeze

9 participants