Skip to content

Conversation

@CharlieL7
Copy link
Collaborator

Motivation

  • Introduce Float8E8M0 type within MIGraphX for better MXFP4 optimizations and to use hipblaslt mxfp4 kernels.

Technical Details

Changelog Category

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

{
public:
static constexpr bool has_infinity = true;
static constexpr bool has_infinity = not(M == 0 and E == 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tidy is being nitpicky here but It saves you the extra not if you make this (M == 0 or E == 0)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's actually a bug. Should probably just be M != 0.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tidy is not suggesting (M == 0 or E == 0), it is suggesting M != 0 or E != 0.

@TedThemistokleous
Copy link
Collaborator

TedThemistokleous commented Oct 25, 2025

So far so good - Tidy seems a bit heavy handed here and CI is giving a nonsensical error for some reason across builds

[2025-10-23T22:49:15.868Z] /var/jenkins/workspace/AMDMIGraphX_PR-4403/src/targets/gpu/hip_gemm_impl.cpp:77:5: error: label at end of compound statement is a C++23 extension [-Werror,-Wc++23-extensions]

[2025-10-23T22:49:15.868Z]    77 |     }

[2025-10-23T22:49:15.868Z]       |     ^

[2025-10-23T22:49:15.868Z] 1 error generated.

[2025-10-23T22:49:15.868Z] make[2]: *** [src/targets/gpu/CMakeFiles/migraphx_gpu.dir/build.make:370: src/targets/gpu/CMakeFiles/migraphx_gpu.dir/hip_gemm_impl.cpp.o] Error 1

[2025-10-23T22:49:15.868Z] make[2]: *** Waiting for unfinished jobs....

[2025-10-23T22:49:15.868Z] /var/jenkins/workspace/AMDMIGraphX_PR-4403/src/targets/gpu/gemm_impl.cpp:79:5: error: label at end of compound statement is a C++23 extension [-Werror,-Wc++23-extensions]

[2025-10-23T22:49:15.868Z]    79 |     }

[2025-10-23T22:49:15.868Z]       |     ^

[2025-10-23T22:49:15.868Z] 1 error generated.

[2025-10-23T22:49:15.868Z] make[2]: *** [src/targets/gpu/CMakeFiles/migraphx_gpu.dir/build.make:328: src/targets/gpu/CMakeFiles/migraphx_gpu.dir/gemm_impl.cpp.o] Error 1

[2025-10-23T22:49:30.250Z] make[2]: Leaving directory '/var/jenkins/workspace/AMDMIGraphX_PR-4403/build'

[2025-10-23T22:49:30.251Z] make[1]: *** [CMakeFiles/Makefile2:21332: src/targets/gpu/CMakeFiles/migraphx_gpu.dir/all] Error 2

[2025-10-23T22:49:30.251Z] make[1]: Leaving directory '/var/jenkins/workspace/AMDMIGraphX_PR-4403/build'

Let me know when you want me to look at this again and its ready for full review unless this needs to be taken out of draft. Not really issues with this right now if that's the case

{
uint8_t exponent;

static constexpr int exponent_bias() { return all_ones<7>(); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The bias is just an integer, per the standard. it should ideally be just referred to by its value: 127, or something like that (a named constant). But defining it all_ones<7> is confusing.

};

template <unsigned int Flags>
struct __attribute__((packed, may_alias)) generic_float<0, 8, Flags>
Copy link
Contributor

@lakhinderwalia lakhinderwalia Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Flags used somewhere in E8M0 case here? I must have missed.


static constexpr int exponent_bias() { return all_ones<7>(); }

explicit constexpr generic_float(float f = 1.0) noexcept { from_float(get_parts(f)); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be defining a generic_float for other formats with a different default value?

static constexpr generic_float infinity()
{
generic_float x{};
x.exponent = all_ones<8>() >> 1u;
Copy link
Contributor

@lakhinderwalia lakhinderwalia Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This I don't get. In fp32, infinity has its exponent defined as 0xff. why is the above line shifting right? Looks like you are converting infinity to the max() here. Maybe just use max() in that case?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no infinity for E8M0 so I tried to follow the convention integral types have for this https://en.cppreference.com/w/cpp/types/numeric_limits/infinity.html. Unfortunately, there is also no zero in E8M0 so this decision is arbitrary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree there is no zero involved here :-)
Should an exception be generated? Because it is the wrong math otherwise for E8M0.
Ideally the test could verify that an exception is being thrown.
These are hairy corner cases.

// min value = 2**(-127)
static constexpr generic_float min()
{
generic_float x{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not define generic_float natively -- rather than go through a default fp32 parameter in its constructor, subsequently its exponent is set!

}

// next number from 2**0 is 2**1 so epsilon is 2**0
static constexpr generic_float epsilon()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were to nit-pick, there is no epsilon or a concept like that for the weight specific E8M0 format.

}

friend constexpr bool operator==(const generic_float& x, const generic_float& y)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just say x == y. Is that not compliable?


constexpr generic_float& operator++() noexcept
{
++exponent;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You perhaps should check for a number to not become a NaN. It should roll over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants