You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository was archived by the owner on Oct 9, 2024. It is now read-only.
How to understand this note: "note: Since Deepspeed-ZeRO can process multiple generate streams in parallel its throughput can be further divided by 8 or 16 ..." #99
Why is it said that only ds_zero is currently doing world_size streams on world_size gpus, while acclerate and ds inference should be doing the same as well since they also use multiprocessing?