-
Couldn't load subscription status.
- Fork 112
generic_float for Float8E8M0
#4403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: mlir_mxfp4_test
Are you sure you want to change the base?
Conversation
| { | ||
| public: | ||
| static constexpr bool has_infinity = true; | ||
| static constexpr bool has_infinity = not(M == 0 and E == 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tidy is being nitpicky here but It saves you the extra not if you make this (M == 0 or E == 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's actually a bug. Should probably just be M != 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tidy is not suggesting (M == 0 or E == 0), it is suggesting M != 0 or E != 0.
|
So far so good - Tidy seems a bit heavy handed here and CI is giving a nonsensical error for some reason across builds Let me know when you want me to look at this again and its ready for full review unless this needs to be taken out of draft. Not really issues with this right now if that's the case |
| { | ||
| uint8_t exponent; | ||
|
|
||
| static constexpr int exponent_bias() { return all_ones<7>(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The bias is just an integer, per the standard. it should ideally be just referred to by its value: 127, or something like that (a named constant). But defining it all_ones<7> is confusing.
| }; | ||
|
|
||
| template <unsigned int Flags> | ||
| struct __attribute__((packed, may_alias)) generic_float<0, 8, Flags> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Flags used somewhere in E8M0 case here? I must have missed.
|
|
||
| static constexpr int exponent_bias() { return all_ones<7>(); } | ||
|
|
||
| explicit constexpr generic_float(float f = 1.0) noexcept { from_float(get_parts(f)); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you be defining a generic_float for other formats with a different default value?
| static constexpr generic_float infinity() | ||
| { | ||
| generic_float x{}; | ||
| x.exponent = all_ones<8>() >> 1u; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This I don't get. In Looks like you are converting infinity to the max() here. Maybe just use max() in that case?fp32, infinity has its exponent defined as 0xff. why is the above line shifting right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no infinity for E8M0 so I tried to follow the convention integral types have for this https://en.cppreference.com/w/cpp/types/numeric_limits/infinity.html. Unfortunately, there is also no zero in E8M0 so this decision is arbitrary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree there is no zero involved here :-)
Should an exception be generated? Because it is the wrong math otherwise for E8M0.
Ideally the test could verify that an exception is being thrown.
These are hairy corner cases.
| // min value = 2**(-127) | ||
| static constexpr generic_float min() | ||
| { | ||
| generic_float x{}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not define generic_float natively -- rather than go through a default fp32 parameter in its constructor, subsequently its exponent is set!
| } | ||
|
|
||
| // next number from 2**0 is 2**1 so epsilon is 2**0 | ||
| static constexpr generic_float epsilon() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I were to nit-pick, there is no epsilon or a concept like that for the weight specific E8M0 format.
| } | ||
|
|
||
| friend constexpr bool operator==(const generic_float& x, const generic_float& y) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just say x == y. Is that not compliable?
|
|
||
| constexpr generic_float& operator++() noexcept | ||
| { | ||
| ++exponent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You perhaps should check for a number to not become a NaN. It should roll over.
Motivation
Technical Details
Changelog Category