Conversation

@xuzhao9 (Contributor) commented Nov 1, 2025

We are adding kernels on AMD in #604

aiter seems suspiciously fast: roughly 10x faster than the H100 baselines at the longer sequence lengths. Need to look deeper into this...
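
One way to start digging (a hedged sketch, not something the benchmark currently does): check that the aiter kernel is actually computing full attention on these shapes by comparing its forward output against PyTorch SDPA. The flash_attn_func import below assumes aiter exposes a flash-attn-style entry point taking (B, S, H, D) tensors; the actual symbol tritonbench wraps may differ.

# Hedged sanity check: compare aiter's flash-attention forward output
# against PyTorch's scaled_dot_product_attention on the largest benchmark shape.
# Assumption: aiter exposes a flash-attn-style flash_attn_func; adjust to the
# entry point the benchmark actually calls.
import torch
from aiter import flash_attn_func  # assumed import path

B, H, S, D = 4, 48, 8192, 64
q, k, v = (torch.randn(B, S, H, D, device="cuda", dtype=torch.bfloat16)
           for _ in range(3))

out = flash_attn_func(q, k, v, causal=False)

# SDPA expects (B, H, S, D), so transpose into and out of that layout.
ref = torch.nn.functional.scaled_dot_product_attention(
    q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
).transpose(1, 2)

print((out - ref).abs().max())  # should be on the order of bf16 rounding error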

Test plan (on MI300):

$ python run.py --op flash_attention --only flex_attention,triton_tutorial_flash_v2,aiter 

  (Batch, Heads, SeqLen, SeqLen_KV, Dhead)    flex_attention-latency    triton_tutorial_flash_v2-latency       aiter-latency
------------------------------------------  ------------------------  ----------------------------------  ------------------
                     (4, 48, 128, 128, 64)        0.078788 (±70.82%)                 0.024714 (±254.78%)  0.072060 (±67.70%)
                     (4, 48, 256, 256, 64)        0.085598 (±52.27%)                   0.028199 (±4.26%)  0.073061 (±67.38%)
                     (4, 48, 512, 512, 64)         0.126614 (±2.02%)                   0.066972 (±6.52%)   0.048226 (±8.14%)
                   (4, 48, 1024, 1024, 64)         0.417134 (±1.97%)                   0.210209 (±4.61%)   0.087360 (±3.26%)
                   (4, 48, 2048, 2048, 64)         1.316211 (±3.77%)                   0.630747 (±2.90%)   0.144759 (±3.87%)
                   (4, 48, 4096, 4096, 64)         5.113835 (±1.29%)                   2.290631 (±3.78%)   0.278904 (±4.06%)
                   (4, 48, 8192, 8192, 64)        19.780565 (±1.54%)                   8.985677 (±0.74%)   0.614846 (±3.44%)
                                   average
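
A back-of-envelope check on the largest shape supports the suspicion (assuming the latency columns are milliseconds and a non-causal forward pass, i.e. 4*B*H*Sq*Skv*D FLOPs):

# Implied throughput of the aiter result at (4, 48, 8192, 8192, 64).
# Assumes a non-causal forward pass: two matmuls, 2 FLOPs per MAC.
def attn_fwd_tflops(B, H, Sq, Skv, D, latency_ms):
    flops = 4 * B * H * Sq * Skv * D
    return flops / (latency_ms * 1e-3) / 1e12

print(attn_fwd_tflops(4, 48, 8192, 8192, 64, 0.614846))  # ~5365 TFLOPS

That is roughly 5.4 PFLOP/s, several times the MI300X dense BF16 peak (about 1.3 PFLOP/s per the published specs); even a causal mask, which halves the FLOP count, would not close the gap. So the aiter number here likely reflects a measurement or configuration issue rather than a genuinely faster kernel.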

In comparison, H100:

  (Batch, Heads, SeqLen, SeqLen_KV, Dhead)    flex_attention-latency    triton_tutorial_flash_v2-latency    triton_tutorial_flash_v2_tma-latency    cudnn-91002-latency
------------------------------------------  ------------------------  ----------------------------------  --------------------------------------  ---------------------
                     (4, 48, 128, 128, 64)         0.027424 (±6.07%)                   0.014112 (±4.31%)                       0.017888 (±3.40%)      0.019584 (±4.25%)
                     (4, 48, 256, 256, 64)         0.038368 (±6.01%)                   0.023648 (±2.98%)                       0.031232 (±2.97%)      0.024128 (±3.98%)
                     (4, 48, 512, 512, 64)         0.065472 (±3.13%)                   0.048160 (±1.59%)                       0.062976 (±1.32%)      0.042592 (±2.10%)
                   (4, 48, 1024, 1024, 64)         0.166912 (±1.38%)                   0.146848 (±0.63%)                       0.170944 (±0.66%)      0.125120 (±0.72%)
                   (4, 48, 2048, 2048, 64)         0.564256 (±0.59%)                   0.530784 (±0.40%)                       0.564352 (±1.28%)      0.440384 (±0.97%)
                   (4, 48, 4096, 4096, 64)         2.111392 (±0.81%)                   2.040704 (±0.98%)                       2.135584 (±2.41%)      1.676480 (±0.64%)
                   (4, 48, 8192, 8192, 64)         8.297536 (±0.14%)                   7.925888 (±2.41%)                       8.023680 (±0.31%)      6.624832 (±1.55%)
                                   average
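
The same arithmetic on the H100 numbers lands inside the roofline: triton_tutorial_flash_v2 at 8192 (about 7.93 ms) implies roughly 416 TFLOPS, comfortably below the H100 SXM dense BF16 peak of roughly 990 TFLOPS. The H100 baselines therefore look plausible, and the discrepancy appears specific to the aiter measurement on MI300.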

@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 1, 2025 04:20 — with GitHub Actions Inactive
@xuzhao9 xuzhao9 requested review from robieta and removed request for robieta November 1, 2025 04:24
@xuzhao9 xuzhao9 changed the title from "[aiter][flash_attention] update aiter and add attn" to "[WIP][aiter][flash_attention] update aiter and add attn" Nov 1, 2025
@xuzhao9 xuzhao9 requested a review from robieta November 1, 2025 04:31