@liquidcarbon liquidcarbon commented Apr 22, 2025

Fix for #595

Little something to get going.

There appear to be unused global dtypes, which I've extended with np.int8 and np.uint8 and used in the first test.

Tests pass with int8 but not uint8.

Shall I modify the rest of the tests in a similar manner? (will check for nuances and fix formatting)

@ashvardanian
Contributor

I'll take it from here, thanks @liquidcarbon!

@ashvardanian
Contributor

Hi @liquidcarbon! The issue comes from the constraints of Python's and NumPy's type systems. There is no separate dtype for bit-packs, so I've reserved the u8 values in Python for bit arrays, and your inputs are being treated as individual bits. In SimSIMD, I've addressed it by requiring users to pass an additional dtype-like parameter (like "bin8" for 8-bit packs).

I can perform a similar change in USearch, but it will break the existing behaviour. I'll tag it accordingly and get back to it for v3. Thanks!
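
To illustrate the ambiguity described above, here is a minimal NumPy-only sketch. It does not call USearch or SimSIMD; the vectors and variable names are made up for illustration. It shows how the same `np.uint8` buffer gives different distances depending on whether each byte is read as an 8-bit integer or as a pack of 8 bits, which is why a dtype alone cannot tell the two apart.

```python
import numpy as np

# Two hypothetical 2-byte vectors stored as uint8 (illustrative values only).
a = np.array([1, 255], dtype=np.uint8)
b = np.array([2, 254], dtype=np.uint8)

# Interpretation 1: each element is an 8-bit unsigned integer.
# Widen before subtracting to avoid uint8 wraparound.
diff = a.astype(np.int32) - b.astype(np.int32)
sq_l2 = int(np.dot(diff, diff))  # squared Euclidean distance over 2 dimensions

# Interpretation 2: each element is a pack of 8 bits,
# so the same buffer is a 16-dimensional binary vector.
bits_a = np.unpackbits(a)
bits_b = np.unpackbits(b)
hamming = int(np.count_nonzero(bits_a != bits_b))  # Hamming distance over 16 bits

print(sq_l2, hamming)
```

With integer semantics the vectors have 2 dimensions; with bit-pack semantics they have 16, and the two distances are unrelated. An explicit dtype-like hint, such as the "bin8" string mentioned for SimSIMD, is one way to make the intended reading unambiguous.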

@ashvardanian ashvardanian added the v3 Breaking changes planned for v3 label May 14, 2025
@liquidcarbon
Author

Maybe it's a good idea to steer clear of unsigned ints altogether.
[image]

@ashvardanian ashvardanian linked an issue Jun 3, 2025 that may be closed by this pull request
3 tasks
@ashvardanian
Contributor

Hi @liquidcarbon! Thanks for all the work! I've merged it into the working draft of v3 🤗

Successfully merging this pull request may close these issues.

Bug: unsigned uint8 misbehaves when building an index