Add pyproject.toml with legacy build backend to keep most logic in setup.py #7033
Conversation
@mrwyattii we just went through some of this with arctic training. If it's helpful, @loadams, let's discuss on Slack a bit. There's a ton currently happening in setup.py, so this could be a big lift. But I agree, it needs to happen!
Edit: this is no longer correct with latest changes. The current problem is that the logic inside setup.py aside from the call to
This change is required to successfully build fp_quantizer extension on ROCm. --------- Co-authored-by: Logan Adams <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Signed-off-by: Logan Adams <[email protected]>
cc @tjruwase @jomayeri --------- Co-authored-by: root <root@ftqtmec25000000.taxzvufipdhelhupulxcbvr15f.ux.internal.cloudapp.net> Signed-off-by: Logan Adams <[email protected]>
Fix #7029 - Add Chinese blog for deepspeed windows - Fix format in README.md Co-authored-by: Logan Adams <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Adding compile support for AIO library on AMD GPUs. --------- Co-authored-by: Olatunji Ruwase <[email protected]> Co-authored-by: Logan Adams <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Make trace cache warnings configurable, and disabled by default. Fix #6985, #4081, #5033, #5006, #5662 --------- Signed-off-by: Olatunji Ruwase <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Update CUDA compute capability for cross compile according to wiki page. https://en.wikipedia.org/wiki/CUDA#GPUs_supported --------- Signed-off-by: Hongwei <[email protected]> Signed-off-by: Logan Adams <[email protected]>
…ently, so we aren't seeing cupy installed. Signed-off-by: Logan Adams <[email protected]>
Propagate API change. Signed-off-by: Olatunji Ruwase <[email protected]> Signed-off-by: Logan Adams <[email protected]>
- add zero2 test - minor fix with transformer version update & ds master merge. Signed-off-by: inkcherry <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Signed-off-by: Logan Adams <[email protected]>
bf16 with moe refresh optimizer state from bf16 ckpt will raise IndexError: list index out of range Signed-off-by: shaomin <[email protected]> Co-authored-by: shaomin <[email protected]> Co-authored-by: Hongwei Chen <[email protected]> Signed-off-by: Logan Adams <[email protected]>
**Auto-generated PR to update version.txt after a DeepSpeed release** Released version - 0.16.4 Author - @loadams Co-authored-by: loadams <[email protected]> Signed-off-by: Logan Adams <[email protected]>
@jeffra and I fixed this many years ago, so bringing this doc to a correct state. --------- Signed-off-by: Stas Bekman <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Description This PR includes Tecorigin SDAA accelerator support. With this PR, DeepSpeed supports SDAA as backend for training tasks. --------- Signed-off-by: siqi <[email protected]> Co-authored-by: siqi <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Co-authored-by: Logan Adams <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Keeps lines within PEP 8 length limits. Enhances readability with a single, concise expression. Preserves original functionality. --------- Signed-off-by: Shaik Raza Sikander <[email protected]> Signed-off-by: Olatunji Ruwase <[email protected]> Signed-off-by: Max Kovalenko <[email protected]> Signed-off-by: inkcherry <[email protected]> Signed-off-by: shaomin <[email protected]> Signed-off-by: Stas Bekman <[email protected]> Signed-off-by: siqi <[email protected]> Signed-off-by: Logan Adams <[email protected]> Signed-off-by: Wei Wu <[email protected]> Signed-off-by: ShellyNR <[email protected]> Signed-off-by: Lai, Yejing <[email protected]> Signed-off-by: Hongwei <[email protected]> Signed-off-by: Liang Cheng <[email protected]> Signed-off-by: A-transformer <[email protected]> Co-authored-by: Raza Sikander <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Co-authored-by: Max Kovalenko <[email protected]> Co-authored-by: Logan Adams <[email protected]> Co-authored-by: inkcherry <[email protected]> Co-authored-by: wukong1992 <[email protected]> Co-authored-by: shaomin <[email protected]> Co-authored-by: Hongwei Chen <[email protected]> Co-authored-by: loadams <[email protected]> Co-authored-by: Stas Bekman <[email protected]> Co-authored-by: siqi654321 <[email protected]> Co-authored-by: siqi <[email protected]> Co-authored-by: Wei Wu <[email protected]> Co-authored-by: Masahiro Tanaka <[email protected]> Co-authored-by: Shelly Nahir <[email protected]> Co-authored-by: snahir <[email protected]> Co-authored-by: Yejing-Lai <[email protected]> Co-authored-by: A-transformer <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Unpin transformers version for all workflows except `nv-torch-latest-v100` as this still has a tolerance issue with some quantization tests. Signed-off-by: Logan Adams <[email protected]>
Resolves #6997 This PR conditionally quotes environment variable values—only wrapping those containing special characters (like parentheses) that could trigger bash errors. Safe values remain unquoted. --------- Signed-off-by: Saurabh <[email protected]> Signed-off-by: Saurabh Koshatwar <[email protected]> Co-authored-by: Logan Adams <[email protected]> Signed-off-by: Logan Adams <[email protected]>
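The conditional quoting described above can be sketched in a few lines of Python. This is only an illustration and not the code from that PR: the helper name `format_export` is hypothetical, and it leans on the standard library's `shlex.quote`, which already returns shell-safe strings unchanged and wraps anything else in single quotes.

```python
import shlex


def format_export(name: str, value: str) -> str:
    """Build a bash `export` line, quoting the value only when needed.

    Hypothetical helper for illustration. shlex.quote() leaves values made
    only of shell-safe characters untouched and single-quotes values that
    contain special characters such as parentheses or spaces.
    """
    return f"export {name}={shlex.quote(value)}"


if __name__ == "__main__":
    # A plain path needs no quoting.
    print(format_export("TOOL_HOME", "/opt/tool/bin"))
    # Parentheses would otherwise trigger a bash syntax error.
    print(format_export("JOB_LABEL", "run(1)"))
```

Relying on `shlex.quote` rather than a hand-written character list keeps the quoting rule in one well-tested place; a project with stricter needs might use its own allow-list instead.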
Correct the BACKWARD_PREFETCH_SUBMIT mismatch FORWARD_PREFETCH_SUBMIT = 'forward_prefetch_submit' --------- Signed-off-by: Shaik Raza Sikander <[email protected]> Signed-off-by: Olatunji Ruwase <[email protected]> Signed-off-by: Max Kovalenko <[email protected]> Signed-off-by: inkcherry <[email protected]> Signed-off-by: shaomin <[email protected]> Signed-off-by: Stas Bekman <[email protected]> Signed-off-by: siqi <[email protected]> Signed-off-by: Logan Adams <[email protected]> Signed-off-by: Wei Wu <[email protected]> Signed-off-by: ShellyNR <[email protected]> Signed-off-by: Lai, Yejing <[email protected]> Signed-off-by: Hongwei <[email protected]> Signed-off-by: A-transformer <[email protected]> Co-authored-by: Raza Sikander <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Co-authored-by: Max Kovalenko <[email protected]> Co-authored-by: Logan Adams <[email protected]> Co-authored-by: inkcherry <[email protected]> Co-authored-by: wukong1992 <[email protected]> Co-authored-by: shaomin <[email protected]> Co-authored-by: Hongwei Chen <[email protected]> Co-authored-by: loadams <[email protected]> Co-authored-by: Stas Bekman <[email protected]> Co-authored-by: siqi654321 <[email protected]> Co-authored-by: siqi <[email protected]> Co-authored-by: Wei Wu <[email protected]> Co-authored-by: Masahiro Tanaka <[email protected]> Co-authored-by: Shelly Nahir <[email protected]> Co-authored-by: snahir <[email protected]> Co-authored-by: Yejing-Lai <[email protected]> Signed-off-by: Logan Adams <[email protected]>
…Tests (#7146) Enhancing ci/nightly coverage for gaudi2 device. Tests added: test_autotp_training.py, test_ulysses.py, test_linear::TestLoRALinear and test_linear::TestBasicLinear, test_ctx::TestEngine. These provide coverage for the model_parallelism and linear features. The tests are stable: 10/10 runs pass. The new tests are expected to increase CI time by 3-4 minutes and nightly job time by 15 minutes. Signed-off-by: Shaik Raza Sikander <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Changes from huggingface/transformers#36654 in transformers cause issues with the torch 2.5 version we were using. This just updated us to use a newer version. --------- Signed-off-by: Logan Adams <[email protected]>
Signed-off-by: Masahiro Tanaka <[email protected]> Signed-off-by: Logan Adams <[email protected]>
@tjruwase Don't merge yet, I will leave a comment when it is ready for merge. Thank you. --------- Signed-off-by: Olatunji Ruwase <[email protected]> Signed-off-by: inkcherry <[email protected]> Signed-off-by: Logan Adams <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Co-authored-by: Logan Adams <[email protected]> Signed-off-by: Logan Adams <[email protected]>
) This PR is a continuation of the efforts to improve DeepSpeed performance when using PyTorch compile. Dynamo breaks the graph because `flat_tensor.requires_grad = False`: * Is a side-effecting operation on tensor metadata * Occurs in a context where Dynamo expects static tensor properties for tracing `flat_tensor.requires_grad` is redundant and can be safely removed because: * `_allgather_params()` function is already decorated with `@torch.no_grad()` which ensures the desired property * `flat_tensor` is created using the `torch.empty()` which sets the `requires_grad=False` by default. --------- Signed-off-by: Max Kovalenko <[email protected]> Co-authored-by: Logan Adams <[email protected]> Co-authored-by: Hongwei Chen <[email protected]> Signed-off-by: Logan Adams <[email protected]>
ZeRO3 requires explicit cleaning in tests when reusing the environment. This PR adds `destroy` calls to the tests to free memory and avoid potential errors due to memory leaks. Signed-off-by: Masahiro Tanaka <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Signed-off-by: c8ef <[email protected]> Co-authored-by: Logan Adams <[email protected]> Co-authored-by: Olatunji Ruwase <[email protected]> Signed-off-by: Logan Adams <[email protected]>
Signed-off-by: Hongwei <[email protected]> Co-authored-by: Logan Adams <[email protected]> Signed-off-by: Logan Adams <[email protected]>
705edb3 to 31ec2b7
    "setuptools>=64",
    "torch",
    "wheel"
If you depend on setuptools 70.1 or later, you won't need wheel.
Suggested change:
-    "setuptools>=64",
-    "torch",
-    "wheel"
+    "setuptools>=70.1",
+    "torch"
Successfully built deepspeed-0.16.5+1d869d1f-cp311-cp311-win_amd64.whl
--no-build-isolation
python -m build - Successfully built deepspeed-0.16.5+1d869d1f.tar.gz and deepspeed-0.16.5+unknown-py3-none-any.whl
The main goal of this effort is to become compliant with the coming changes to pip in 25.1 listed here which will break editable installs. Future PRs will fully move from setup.py to pyproject.toml.
Fixes: #7031
MII equivalent PR: deepspeedai/DeepSpeed-MII#555
DS-Kernels equivalent PR: deepspeedai/DeepSpeed-Kernels#20