Adding multisample feature along with testcases #740
1b01b6f to 6a77302
@tchaton @deependujha @bhimrazy Can you verify the approach once? I can then make changes to the README.
src/litdata/streaming/dataset.py
    index_path: Optional[str] = None,
    force_override_state_dict: bool = False,
    transform: Optional[Union[Callable, list[Callable]]] = None,
    is_multisample: bool = False,
How will you know how many samples the user wants?
Suggested change:
    - is_multisample: bool = False,
    + sample_count: int = 1,
This is better. I'll add this.
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@         Coverage Diff          @@
##           main   #740   +/-   ##
===================================
  Coverage    80%    80%
===================================
  Files        52     52
  Lines      7330   7345   +15
===================================
+ Hits       5869   5884   +15
  Misses     1461   1461
Before submitting
What does this PR do?
Fixes #317
PR review
Added support for multisample items.
This adds a sample_count parameter that produces a batch of sub-samples for each sample, using a single transform function.
Sample code:
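A minimal sketch of the idea, not the PR's actual sample or implementation. It uses a toy, in-memory stand-in rather than litdata's StreamingDataset. The class name MultiSampleView, the (item, sub_idx) transform signature, and the rule that the length grows by a factor of sample_count are all assumptions made for illustration.

```python
from typing import Any, Callable, Sequence


class MultiSampleView:
    """Toy stand-in (NOT litdata's StreamingDataset): expands each item into sub-samples."""

    def __init__(
        self,
        items: Sequence[Any],
        transform: Callable[[Any, int], Any],  # assumed signature: (item, sub_idx)
        sample_count: int = 1,
    ) -> None:
        if sample_count < 1:
            raise ValueError("sample_count must be >= 1")
        self.items = items
        self.transform = transform
        self.sample_count = sample_count

    def __len__(self) -> int:
        # Assumption: every underlying item contributes sample_count sub-samples.
        return len(self.items) * self.sample_count

    def __getitem__(self, index: int) -> Any:
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Map the flat index to (which item, which sub-sample of that item).
        item_idx, sub_idx = divmod(index, self.sample_count)
        return self.transform(self.items[item_idx], sub_idx)


def shift(x: int, sub_idx: int) -> int:
    """Hypothetical transform: derive a distinct value per sub-sample."""
    return x * 10 + sub_idx


view = MultiSampleView([1, 2], transform=shift, sample_count=3)
expanded = [view[i] for i in range(len(view))]
print(expanded)
```

With sample_count=1 the view behaves like a plain transformed dataset, which is why a single integer parameter can replace a separate boolean flag such as is_multisample.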
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃