@CuiYifeng
Contributor

Resolves #2305.
This PR adds support for copying tensors with the Float4_e2m1fn_x2 data type on XPU devices.

Copilot AI review requested due to automatic review settings November 7, 2025 08:49
Contributor

Copilot AI left a comment

Pull Request Overview

This PR enables copying tensors with the Float4_e2m1fn_x2 data type on XPU devices by adding kernel support and extending test coverage. This addresses issue #2305 by implementing the missing copy operation for this float4 type.

Key changes:

  • Added float4_copy_kernel_xpu function to handle Float4_e2m1fn_x2 copy operations
  • Extended test coverage to include Float4_e2m1fn_x2 dtype in copy/clone tests

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/ATen/native/xpu/sycl/CopyKernel.cpp | Implements the float4 copy kernel and integrates it into the main copy_kernel dispatch logic |
| test/regressions/test_copy.py | Adds Float4_e2m1fn_x2 to the test dtypes for copy and clone operations |

Comment on lines +114 to +122
void float4_copy_kernel_xpu(TensorIteratorBase& iter) {
  ScalarType src_dtype = iter.dtype(1);

  if (src_dtype == kFloat4_e2m1fn_x2) {
    gpu_kernel_nocast(iter, CopyScalarFunc<Float4_e2m1fn_x2>());
  } else {
    TORCH_CHECK(false, "Copy from ", src_dtype, " to Float4_e2m1fn_x2 has not been supported.");
  }
}
Copilot AI Nov 7, 2025

The implementation only supports Float4_e2m1fn_x2 to Float4_e2m1fn_x2 copy, but the pattern in float8_copy_kernel_xpu shows support for casting from common types (float, Half, BFloat16). Consider adding similar casting support for Float4_e2m1fn_x2 to match the capabilities of the float8 implementation and provide a more complete copy operation.
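
For context on why a same-dtype copy is simpler than a cast, here is a minimal standalone sketch. It does not use the PyTorch or SYCL APIs. `PackedFp4x2`, `copy_same_dtype`, `decode_e2m1`, and `decode_pair` are hypothetical names introduced only for illustration, and the nibble order (low nibble first) is an assumption rather than something this PR states. The sketch relies only on the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit) with two values packed per byte.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a packed float4 element: one byte holding two
// 4-bit E2M1 codes. This is NOT the c10::Float4_e2m1fn_x2 definition.
struct PackedFp4x2 {
  std::uint8_t bits;
};

// Same-dtype copy: the packed byte is passed through unchanged, so no
// decoding or rounding is involved.
inline PackedFp4x2 copy_same_dtype(PackedFp4x2 src) {
  return src;
}

// Decode one 4-bit E2M1 code: bit 3 is the sign, bits 0-2 select one of
// eight magnitudes (exponent bias 1, one mantissa bit, no inf/NaN).
inline float decode_e2m1(std::uint8_t nibble) {
  assert(nibble <= 0x0F && "expected a 4-bit code");
  static const float kMagnitudes[8] = {0.0f, 0.5f, 1.0f, 1.5f,
                                       2.0f, 3.0f, 4.0f, 6.0f};
  const float magnitude = kMagnitudes[nibble & 0x7];
  return (nibble & 0x8) ? -magnitude : magnitude;
}

// A cast to float produces two outputs per input byte, so it is not a
// one-to-one elementwise operation on the packed storage.
// Assumption: the low nibble holds the first logical element.
inline void decode_pair(PackedFp4x2 src, float& first, float& second) {
  first = decode_e2m1(static_cast<std::uint8_t>(src.bits & 0x0F));
  second = decode_e2m1(static_cast<std::uint8_t>(src.bits >> 4));
}
```

Under these assumptions, a same-dtype copy can treat each packed byte as an opaque element, whereas a cast from float, Half, or BFloat16 would have to map two logical values onto each stored byte. That difference may be one reason the casting paths suggested above need more than a direct port of the float8 pattern.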

3 participants