[Float8] register fp8 quant/dequant only on CPU #2961

shiyang-weng · 2025-09-09T03:18:11Z

The goal is to enable FP8 quantization in PyTorch. As with INT8 quantization, this requires inserting quantize and dequantize operations into the computational graph. To reuse the INT8 pattern-matching logic, we need to register the FP8 quant and dequant ops.
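For readers unfamiliar with what a quantize/dequantize pair computes, here is a minimal pure-Python sketch of per-tensor FP8 scaling. It is not code from this PR: it assumes the `float8_e4m3fn` format (3 mantissa bits, largest finite value 448), and the helper names `round_to_e4m3`, `quantize`, and `dequantize` are made up for illustration.

```python
import math

E4M3_MAX = 448.0      # largest finite float8_e4m3fn value (assumed format)
MANTISSA_BITS = 3     # e4m3 stores 3 explicit mantissa bits
MIN_NORMAL_EXP = -6   # smallest normal exponent; subnormals share its spacing


def round_to_e4m3(v: float) -> float:
    """Round v to the nearest e4m3fn-representable value, saturating at +/-448."""
    if v == 0.0:
        return 0.0
    a = min(abs(v), E4M3_MAX)            # saturate instead of overflowing
    _, ex = math.frexp(a)                # a = m * 2**ex with 0.5 <= m < 1
    e = max(ex - 1, MIN_NORMAL_EXP)      # exponent of the binade containing a
    step = 2.0 ** (e - MANTISSA_BITS)    # spacing between neighbours in that binade
    return math.copysign(round(a / step) * step, v)


def quantize(values, scale):
    """Hypothetical quant step: divide by the scale, then round to FP8."""
    return [round_to_e4m3(v / scale) for v in values]


def dequantize(codes, scale):
    """Hypothetical dequant step: multiply the FP8 values back by the scale."""
    return [c * scale for c in codes]


x = [0.1, -1.5, 3.0]
scale = max(abs(v) for v in x) / E4M3_MAX   # per-tensor scale from the abs max
q = quantize(x, scale)
x_hat = dequantize(q, scale)
print(q, x_hat)
```

The round trip is lossy: with 3 mantissa bits, each recovered value can differ from the input by up to about 6% in relative terms.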

We previously attempted to register quant in #2379, but that PR was reverted in #2672 because it caused a performance regression on H100 GPUs. There is also no need to register q/dq on CUDA.

For these reasons, this PR registers quant and dequant for CPU only.

pytorch-bot · 2025-09-09T03:18:15Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2961

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 88d5158 with merge base 3760978:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168

Seems OK to me. Wondering if @vkuzo has additional thoughts; I'm not sure whether there is a better alternative here to support preserving ops for CPU.

shiyang-weng · 2025-09-11T01:53:56Z

@vkuzo Could you help review this PR?

jerryzh168 · 2025-09-11T02:13:00Z

> @vkuzo Could you help review this PR?

Is this urgent? Vasiliy is currently unavailable and will be back next week.

shiyang-weng · 2025-09-11T03:23:26Z

> is this urgent? Vasiliy is not available recently and will be back next week

Thanks for letting me know. Not urgent. We can wait until he is back next week.

register fp8 quant/dequant only on CPU

88d5158

meta-cla bot added the CLA Signed label Sep 9, 2025

shiyang-weng marked this pull request as draft September 9, 2025 03:18

Xia-Weiwen approved these changes Sep 9, 2025


Xia-Weiwen requested review from jerryzh168, andrewor14 and drisspg September 9, 2025 07:42

Xia-Weiwen added the topic: not user facing label Sep 9, 2025

shiyang-weng marked this pull request as ready for review September 10, 2025 01:28

jerryzh168 requested a review from vkuzo September 11, 2025 00:10

jerryzh168 approved these changes Sep 11, 2025

