I was bothered by the presumed waste of using 12 bytes to track which type slots are filled, rather than 12 bits.
So today I did the work to adapt the basic object structure like so: (full code at jepler@bitfield-slots)

    // This struct is a variable sized struct. In order to use this as a
    // member, or allocate dynamically, use the mp_obj_empty_type_t or
    // mp_obj_full_type_t structs below (which must be kept in sync).
    struct _mp_obj_type_t {
        // A type is an object so must start with this entry, which points to mp_type_type.
        mp_obj_base_t base;

        // Flags associated with this type.
        uint16_t flags;

        // The name of this type, a qstr.
        uint16_t name;

        // Slots: If a slot is populated, the corresponding bit in `filled_slots` is set.
        // The index in `slots[]` is simply the number of set bits "below" that one,
        // which can be determined with __builtin_popcount().
        uint16_t filled_slots;

        // (16 spare bits here on most ports!)
        const void *slots[];
    };

... and changed the rest of the core as needed until `make VARIANT=coverage test` passed on the unix port.
Then I noticed that Cortex-M4 (and everything below it) lacks a popcount instruction. So this saved a mere 4 bytes on the "itsybitsy" build, and probably tanked performance because of all the newly introduced `__popcountsi2` calls. It also makes the `MP_DEFINE_CONST_OBJ_TYPE_NARGS_` macros much more gruesome.
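
For anyone who wants to see where the popcount cost comes from, here is a minimal sketch of the slot lookup described in the struct comments. It is not the code from the branch: `demo_type_t` and `demo_slot_lookup` are hypothetical names, the slot array has a fixed size to keep the example self-contained, and it assumes GCC or Clang for `__builtin_popcount()`.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for the type struct: only the fields the lookup
// needs, and a fixed-size array instead of a flexible array member.
typedef struct {
    uint16_t filled_slots;   // bit n set => slot n is populated
    const void *slots[12];   // packed: only populated slots are stored
} demo_type_t;

// Return the entry for `slot`, or NULL if that slot is not populated.
static const void *demo_slot_lookup(const demo_type_t *t, unsigned slot) {
    uint16_t bit = (uint16_t)(1u << slot);
    if (!(t->filled_slots & bit)) {
        return NULL;
    }
    // The packed index is the number of set bits below `bit`.
    // Without a hardware popcount instruction, the compiler may turn this
    // builtin into a library call, which is the cost on every lookup.
    unsigned index = (unsigned)__builtin_popcount(t->filled_slots & (bit - 1u));
    return t->slots[index];
}
```

With slots 1, 2 and 5 filled, a lookup of slot 5 masks off everything at or above bit 5, counts the two set bits below it, and returns `slots[2]`. Every slot access pays for that count, so the missing instruction matters more than the bytes saved.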