Remove poor-performing bitmasks, add Select trait, and enable select on integer bitmasks #482

calebzulawski · 2025-09-10T13:48:56Z

Summary:

Mask<T, N> == Simd<T, N>
Allow 0b0101.select(a, b)

This is a compromise. My justification:

There is no actual universal/portable mask implementation, but full width SIMD vector masks are pretty close.
Having a specified memory layout is simpler and some people will probably make the assumption anyway, even if we document otherwise.
AVX/AVX512 and NEON/SVE mean that on a single architecture you could have both types of masks. Using cfg and building std from source isn't really a good option. By providing both we can at least allow using each mask type in the same program depending on detected target features.
If Mask<T, N> is equivalent to Simd<T, N> in layout, I think we can finally put Change Mask element to match Simd element #322 to rest.

Since Mask is still represented as <N x i1> in LLVM, I think this mostly just affects memory layout. I think this actually improves codegen since the full width version of Mask is more transparent than the (deleted in this PR) AVX512 bitmask version.

Also, this PR allows selecting any vector from any width Mask with the same number of elements. I'm not sure if that should be supported or not, but Masks already implement Into to each other.

This is the first item on my list for trying to get everything resolved for stabilization :)

…it that also supports bitmasks.

programmerjake · 2025-09-11T03:02:52Z

I'm not so sure that we don't want an opaque Mask<T, N> type -- just because x86 and Arm Neon work fine with those types doesn't mean everything will work great with those types -- e.g. RISC-V uses bitmasks, and even if you use fixed-size SIMD values (relying on the minimum vector register sizes), using bitmasks should be more efficient.

If we want known-layout mask types, we could just export the FullMask<T, N> type (equivalent to Simd<i32, N> where T is the same size/align as i32 and so on) and the BitMask<N> type.

calebzulawski · 2025-09-11T05:27:16Z

Full vector Mask<T, N> is still pretty transparently <N x i1> so other than going to memory, it should work pretty much everywhere as you'd expect (RVV example). Once you're going to memory I feel like it's more useful to know the layout than having an opaque optimization to possibly remove one instruction. The underlying bitmask layout is hacky (to support arbitrary length bitmasks) so we lose the transparency and the codegen is terrible.

With this change you can still use Mask<T, N> most of the time, but allows you to opt to use u64 as a mask instead which is more transparently <64 x i1> and actually produces usable bitmask codegen.

My issue with keeping them the same type is that Mask<i32, 4> seems to have a very specific implication (it contains i32s) and a bitmask should look more like BitMask<4>, or literally just an integer. Maybe the language will gain arbitrary length integers or a bitset type to support that in the future.

calebzulawski · 2025-09-11T05:32:45Z

If we had a magic int<N> type that we could use as the bitmask implementation I'd probably be fine with that, but I don't want to wait to stabilize std::simd on that

calebzulawski · 2025-09-11T05:48:03Z

RVV doesn't have the bitmask implementation cfged right now anyway, but for reference, here's how those same two functions compile with it:

<masked_add>:
                addi    sp, sp, -0x10
                sd  a1, 0x8(sp)
                addi    a1, sp, 0x8
                vsetivli    zero, 0x5, e32, m2, ta, ma
                vlm.v   v0, (a1)
                vmv.v.i v8, 0x0
                vsetvli zero, zero, e32, m2, tu, mu
                vle32.v v8, (a3), v0.t
                vsetvli zero, zero, e32, m2, ta, ma
                vle32.v v10, (a2)
                vadd.vv v8, v10, v8
                vse32.v v8, (a0)
                addi    sp, sp, 0x10
                ret

Disassembly of section .text.bitmasked_add:

<bitmasked_add>:
                addi    sp, sp, -0x10
                sd  a1, 0x8(sp)
                addi    a1, sp, 0x8
                vsetivli    zero, 0x5, e32, m2, ta, ma
                vlm.v   v0, (a1)
                vmv.v.i v8, 0x0
                vsetvli zero, zero, e32, m2, tu, mu
                vle32.v v8, (a3), v0.t
                vsetvli zero, zero, e32, m2, ta, ma
                vle32.v v10, (a2)
                vadd.vv v8, v10, v8
                vse32.v v8, (a0)
                addi    sp, sp, 0x10
                ret

programmerjake · 2025-09-11T21:16:22Z

If we had a magic int<N> type that we could use as the bitmask implementation I'd probably be fine with that, but I don't want to wait to stabilize std::simd on that

I'm fine if we temporarily don't have the bitmask Mask while waiting for generic integers (we can stabilize it like that as long as we're very clear that it has unspecified layout and will probably change in the future on some targets), I'm not fine with guaranteeing the main Mask type is always a full mask.

calebzulawski · 2025-09-12T02:55:42Z

So maybe I keep this PR how it is (with the new Select trait) but add back the unspecified layout documentation. I'll make a follow up PR that changes the generic type to be the vector type rather than the full width mask integer type. With the addition of the Select trait it should make it possible to do e.g. Select<Simd<u64, N>> for Mask<f64, N>> (rather than needing .into() everywhere), that's probably good enough for me to be okay with the unspecified layout.

programmerjake · 2025-09-12T03:39:38Z

I think the full mask type should have the contained Simd type still use integers as the element type for better LLVM optimization even if it's fine to now allow Mask<f64, 4> or Mask<*const (), 2> -- so those would contain Simd<i64, 4> and Simd<isize, 2> respectively

calebzulawski · 2025-09-12T04:41:38Z

Ok, I reverted masks to have an unspecified layout. My plan in the follow-up is to change masks to struct Mask<T: SimdElement, const N: usize>(Simd<T::Mask, N>) which should match what you're describing.

This PR now just removes the bitmask implementation and changes mask.select() to be a trait

programmerjake · 2025-09-12T09:17:16Z

crates/core_simd/src/masks.rs

+        } else {
+            // Safety: bitmask matches length
+            unsafe { to_bitmask_impl::<T, u64, 64, N>(self) }
+        }


maybe add a todo!() for N > 64?

programmerjake · 2025-09-12T09:20:20Z

crates/core_simd/src/masks.rs

@@ -450,7 +491,8 @@ where
    type Output = Self;
    #[inline]
    fn bitand(self, rhs: Self) -> Self {
-        Self(self.0 & rhs.0)


this can just stay unchanged -- same for all the other bitwise ops.

calebzulawski added 2 commits September 10, 2025 00:17

Guarantee Mask has the same layout as Simd. Implement select as a tra…

1146930

…it that also supports bitmasks.

Remove mask backing implementations

b8df2c9

calebzulawski requested review from programmerjake and workingjubilee September 10, 2025 13:49

Fix clippy

e75a8d8

calebzulawski force-pushed the masks-v2-final-final-copy branch from 6efa217 to da8fe78 Compare September 12, 2025 04:38

calebzulawski changed the title ~~Define Mask<T, N> as equivalent to Simd<T, N> and enable select on integer bitmasks~~ Remove poor-performing bitmasks, add Select trait, and enable select on integer bitmasks Sep 12, 2025

calebzulawski force-pushed the masks-v2-final-final-copy branch from da8fe78 to cd88ff3 Compare September 12, 2025 06:02

calebzulawski added 2 commits September 12, 2025 02:36

Fix endianness correction

728f375

Revert layout guarantee

45a18ae

calebzulawski force-pushed the masks-v2-final-final-copy branch from cd88ff3 to 45a18ae Compare September 12, 2025 06:36

programmerjake reviewed Sep 12, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Remove poor-performing bitmasks, add Select trait, and enable select on integer bitmasks #482

Remove poor-performing bitmasks, add Select trait, and enable select on integer bitmasks #482

Uh oh!

calebzulawski commented Sep 10, 2025 •

edited

Loading

Uh oh!

programmerjake commented Sep 11, 2025

Uh oh!

calebzulawski commented Sep 11, 2025 •

edited

Loading

Uh oh!

calebzulawski commented Sep 11, 2025

Uh oh!

calebzulawski commented Sep 11, 2025 •

edited

Loading

Uh oh!

programmerjake commented Sep 11, 2025

Uh oh!

calebzulawski commented Sep 12, 2025 •

edited

Loading

Uh oh!

programmerjake commented Sep 12, 2025 •

edited

Loading

Uh oh!

calebzulawski commented Sep 12, 2025 •

edited

Loading

Uh oh!

programmerjake Sep 12, 2025

Uh oh!

programmerjake Sep 12, 2025

Uh oh!

Uh oh!

Remove poor-performing bitmasks, add Select trait, and enable select on integer bitmasks #482

Are you sure you want to change the base?

Remove poor-performing bitmasks, add Select trait, and enable select on integer bitmasks #482

Uh oh!

Conversation

calebzulawski commented Sep 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

programmerjake commented Sep 11, 2025

Uh oh!

calebzulawski commented Sep 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

calebzulawski commented Sep 11, 2025

Uh oh!

calebzulawski commented Sep 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

programmerjake commented Sep 11, 2025

Uh oh!

calebzulawski commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

programmerjake commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

calebzulawski commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

programmerjake Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

programmerjake Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

calebzulawski commented Sep 10, 2025 •

edited

Loading

calebzulawski commented Sep 11, 2025 •

edited

Loading

calebzulawski commented Sep 11, 2025 •

edited

Loading

calebzulawski commented Sep 12, 2025 •

edited

Loading

programmerjake commented Sep 12, 2025 •

edited

Loading

calebzulawski commented Sep 12, 2025 •

edited

Loading