Replies: 3 comments
-
I also encountered the same issue. After quantization, the partition fails.
-
@kimishpatel @metascroy Do you have any suggestions on how to handle this?
-
Have you looked at https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/quantizer/composable_quantizer.py#L17?
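For reference, the linked `ComposableQuantizer` chains several child quantizers over the same graph, each annotating only the nodes it recognizes. A minimal pure-Python sketch of that dispatch pattern (the classes below are illustrative stand-ins, not the real `torch.ao.quantization` API; the backend names and op sets are hypothetical):

```python
# Illustrative sketch of the ComposableQuantizer dispatch pattern:
# each child quantizer annotates only the nodes it recognizes, in order,
# and later quantizers skip nodes that are already annotated.

class Node:
    def __init__(self, op):
        self.op = op
        self.annotation = None  # backend-specific quant config would go here

class OpSetQuantizer:
    """Stand-in for a real Quantizer: claims nodes whose op is in its set."""
    def __init__(self, name, supported_ops):
        self.name = name
        self.supported_ops = supported_ops

    def annotate(self, graph):
        for node in graph:
            if node.annotation is None and node.op in self.supported_ops:
                node.annotation = self.name

class ComposableQuantizer:
    """Chains quantizers; list order decides priority on overlapping op support."""
    def __init__(self, quantizers):
        self.quantizers = quantizers

    def annotate(self, graph):
        for q in self.quantizers:
            q.annotate(graph)
        return graph

# Hypothetical split: NPU-A takes conv, NPU-B takes matmul, CPU takes the rest.
graph = [Node(op) for op in ["conv2d", "matmul", "add", "conv2d", "relu"]]
composed = ComposableQuantizer([
    OpSetQuantizer("npu_a", {"conv2d"}),
    OpSetQuantizer("npu_b", {"matmul"}),
    OpSetQuantizer("xnnpack", {"add", "relu"}),
])
composed.annotate(graph)
print([n.annotation for n in graph])  # per-node backend annotation
```

Because annotation happens per node rather than per partition, each backend's quantization scheme is already attached to its ops before any partitioning takes place.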
-
My team aims to develop a heterogeneous inference framework using ExecuTorch, but we are currently grappling with challenges in heterogeneous quantization.
Consider this scenario: For a single model, we plan to simultaneously delegate computations to NPU-A, NPU-B, and CPU backends. The CPU will utilize XNNPACKQuantizer, while NPU-A and NPU-B require custom quantization algorithms. How can we apply these three distinct quantization methods to their respective partitions before the graph is partitioned?
Based on my understanding of ExecuTorch's workflow (quantization at ATen IR → partitioning at Edge IR), suppose the graph is partitioned into:
P1 (executed on NPU-A)
P2 (executed on NPU-B)
P3 (executed on CPU via XNNPACK)
How can we ensure that the NPU-A-specific quantization algorithm is applied to P1 during the ATen IR quantization stage, given that partitioning only occurs later at the Edge IR stage?
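One way to reconcile the two stages is to make each backend's quantizer and partitioner agree on the same op set: the quantizer tags a node at ATen IR, and that backend's partitioner later claims nodes carrying a matching tag at Edge IR. A schematic sketch under that assumption (plain Python, not the ExecuTorch API; the op-to-backend mapping is hypothetical):

```python
from collections import defaultdict

# Schematic: partitioning driven by the annotations left at quantization time.
# Each node carries the backend tag its quantizer assigned at ATen IR;
# the "partitioner" here simply groups nodes by that tag into P1/P2/P3.

def partition_by_annotation(annotated_graph):
    """annotated_graph: list of (op, backend_tag) pairs, tags set at quantization."""
    partitions = defaultdict(list)
    for op, tag in annotated_graph:
        partitions[tag].append(op)
    return dict(partitions)

# Hypothetical graph after all three quantizers have annotated it.
graph = [
    ("conv2d", "npu_a"),    # -> P1
    ("matmul", "npu_b"),    # -> P2
    ("add", "xnnpack"),     # -> P3
    ("conv2d", "npu_a"),    # -> P1
]
print(partition_by_annotation(graph))
```

Under this scheme the quantization decision never depends on the partition boundaries; the partition boundaries fall out of the quantization annotations, so the ordering of the two stages is no longer a problem.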