update gpu block size based on xattn #2764
Merged
peterchen-intel merged 29 commits into openvinotoolkit:master from rnwang04:pa_block_xattn on Oct 28, 2025
Commits (29):
352c7a0 update gpu block size based on xattn (rnwang04)
51d0018 update gpu block size based on xattn (rnwang04)
eded411 merge (rnwang04)
ac7a454 add missing GenerationConfig (rnwang04)
e6ba90a fix use_sparse_attention=False (rnwang04)
4101008 Merge branch 'master' into pa_block_xattn (ceciliapeng2011)
9dcb2da update GenerationConfig based on comments (rnwang04)
c5c67ba refactor: get gpu block_size from value_cache. (ceciliapeng2011)
9ccf083 update solution for use_sparse_attention=False based on comments (rnwang04)
79fd027 Merge branch 'pa_block_xattn' of https://github.com/rnwang04/openvino… (rnwang04)
c96728c Merge branch 'master' into pa_block_xattn (rnwang04)
9c27fc1 Merge branch 'master' into pa_block_xattn (peterchen-intel)
9ab91e1 add log to show if XAttention is actually ON/OFF. (ceciliapeng2011)
526ba22 fix (ceciliapeng2011)
7bd2aaf wwb support xattention (wgzintel)
9c752b9 remove blank line (wgzintel)
13053a9 refactoring the code (wgzintel)
8d481e8 Merge pull request #2 from wgzintel/guozhong/wwb_support_xattention (rnwang04)
a740969 Merge branch 'master' into pa_block_xattn (peterchen-intel)
3cdda53 Code format update (peterchen-intel)
516bff5 Merge branch 'master' into pa_block_xattn (peterchen-intel)
9f46be6 refactor based on copilot review (ceciliapeng2011)
77f7961 fix lint error (ceciliapeng2011)
66950d5 fix lint error (ceciliapeng2011)
237cb5c Merge branch 'master' into pa_block_xattn (peterchen-intel)
e5bd53e Merge branch 'master' into pa_block_xattn (peterchen-intel)
5d1e2e6 Merge branch 'master' into pa_block_xattn (peterchen-intel)
2782171 Merge branch 'master' into pa_block_xattn (ceciliapeng2011)
4669a25 remove other changes, only keep gpu block size (rnwang04)