
Conversation


@clems4ever commented on Sep 26, 2025

In most cases, users of the API simply pass a string to the tokenizer to get a list of tokens to feed to the model. In that case there is no need to call the reflect library at all, which can save significant CPU cycles and consequently improve the throughput of services using the tokenizer library. Note that this concerns batch tokenization, not single tokenization.
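To make the trade-off concrete, here is a minimal, self-contained Go sketch of the pattern this PR targets. The names `NewInputSequence` and `NewRawInputSequence` come from the PR itself, but the `InputSequence` struct and the dispatch internals shown here are simplified assumptions, not the library's actual implementation:

```go
package main

import (
	"fmt"
	"reflect"
)

// InputSequence is a stand-in for the library's input type (assumption).
type InputSequence struct {
	sentence string
}

// NewInputSequence dispatches on the dynamic type of its argument.
// The reflect call here is the per-input cost the PR wants to avoid.
func NewInputSequence(input interface{}) InputSequence {
	switch reflect.TypeOf(input).Kind() {
	case reflect.String:
		return InputSequence{sentence: input.(string)}
	default:
		panic(fmt.Sprintf("unsupported input type: %T", input))
	}
}

// NewRawInputSequence is the typed fast path: callers that already hold
// a string construct the sequence directly, with no reflection.
func NewRawInputSequence(input string) InputSequence {
	return InputSequence{sentence: input}
}

func main() {
	// Both paths yield the same sequence; only the construction cost differs.
	a := NewInputSequence("hello world")
	b := NewRawInputSequence("hello world")
	fmt.Println(a.sentence == b.sentence) // true
}
```

The two constructors produce the same sequence; the raw variant simply skips the per-call type inspection, which adds up when constructing many sequences in a tight batch loop.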

@clems4ever (Author) commented:

Hello. Thanks for maintaining this awesome project. I'm about to use the library to compute embeddings in a potentially high-throughput scenario, so I thought this optimization might be worth it since it's low-hanging fruit. Please let me know what you think. Thanks.
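Before relying on this in a high-throughput pipeline, it is worth measuring the difference. A hedged benchmark sketch, assuming the stand-in definitions from the sketch above sit in the same package (placed in a `_test.go` file):

```go
package main

import "testing"

// sink prevents the compiler from optimizing the constructor calls away.
var sink InputSequence

// BenchmarkNewInputSequence measures the reflection-based path.
func BenchmarkNewInputSequence(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = NewInputSequence("some text to tokenize")
	}
}

// BenchmarkNewRawInputSequence measures the direct, typed path.
func BenchmarkNewRawInputSequence(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = NewRawInputSequence("some text to tokenize")
	}
}
```

Run with `go test -bench=. -benchmem`; the raw constructor should show a lower per-op cost, but confirm on your own workload before drawing conclusions.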

@clems4ever changed the title from "publicly expose NewRawInputSequence to prevent costly reflection calls" to "Make NewRawInputSequence public to prevent costly reflection" on Sep 26, 2025
