
Conversation

@Edenzzzz (Collaborator) commented on Jun 29, 2025

After #594, encoder offload is enabled by default, so tensor parallelism (TP) will almost always be slower than offload + layer-wise prefetch. This PR sets the default TP size to 1 for both training and inference.
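For context, here is a minimal sketch of what "TP size defaults to 1" could look like at the argument level. The argument names (`--tp-size`, `--sp-size`) and the argparse wiring are illustrative assumptions, not the repository's actual interface:

```python
# Hypothetical sketch of the default change described above; the real
# argument names and plumbing in the repo may differ.
import argparse


def add_parallelism_args(parser: argparse.ArgumentParser) -> None:
    # Tensor parallelism now defaults to 1: with encoder offload plus
    # layer-wise prefetch on by default, sharding the model across GPUs
    # (TP > 1) is almost always slower than leaving TP disabled.
    parser.add_argument(
        "--tp-size", type=int, default=1,
        help="Tensor-parallel degree (default: 1, i.e. disabled)",
    )
    # Other parallelism degrees (e.g. sequence parallelism) are assumed
    # unchanged and still pick up the remaining GPUs.
    parser.add_argument(
        "--sp-size", type=int, default=-1,
        help="Sequence-parallel degree (-1 = use all available GPUs)",
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    add_parallelism_args(parser)
    args = parser.parse_args([])  # rely on the built-in defaults
    assert args.tp_size == 1      # TP is off unless explicitly requested
```

Existing scripts that explicitly pass a TP degree would keep their behavior; only runs that relied on the old default would change.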

I will need to spend some time finishing up the LoRA features, so please ensure this doesn't break any training or inference scripts.

cc @BrianChen1129 @SolitaryThinker

@Edenzzzz temporarily deployed to runpod-runners on June 29, 2025 00:18 with GitHub Actions (now inactive)
@SolitaryThinker added the `go` (Trigger Buildkite CI) label on Jun 30, 2025
@Edenzzzz force-pushed the wenxuan/deprecate_tp branch from 8206640 to 53a6f4a on July 5, 2025 01:08
@Edenzzzz merged commit 65ed588 into main on Jul 9, 2025 (3 of 4 checks passed)
@Edenzzzz deleted the wenxuan/deprecate_tp branch on July 9, 2025 22:44