@zhxchen17

Summary:
Replace the API usage and remove some dead code.

Test Plan:

```
NGPU=4 CONFIG_FILE=./torchtitan/models/deepseek_v3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.deepseek_v3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=2 --parallelism.expert_parallel_degree=2 --activation_checkpoint.mode none

NGPU=4 CONFIG_FILE=./torchtitan/models/deepseek_v3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.deepseek_v3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=2 --parallelism.expert_parallel_degree=2 --activation_checkpoint.mode none --model.flavor=debugmodel_flex_attn

NGPU=8 CONFIG_FILE=./torchtitan/models/llama3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.llama3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=4

NGPU=8 CONFIG_FILE=./torchtitan/models/llama3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.llama3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=4 --model.flavor=debugmodel_flex_attn
```

@meta-cla meta-cla bot added the CLA Signed label Nov 3, 2025
@zhxchen17 zhxchen17 marked this pull request as draft November 3, 2025 20:10
@zhxchen17
Author

Will wait for the nightly build to catch up

@zhxchen17 zhxchen17 force-pushed the zhxchen17/update_export_api branch 2 times, most recently from 314f20a to 47cc583 Compare November 5, 2025 14:11
@zhxchen17 zhxchen17 marked this pull request as ready for review November 5, 2025 14:13
@zhxchen17
Author

cc @yiming0416

@zhxchen17 zhxchen17 force-pushed the zhxchen17/update_export_api branch from 47cc583 to 70897e3 Compare November 5, 2025 15:19
@zhxchen17 zhxchen17 marked this pull request as draft November 5, 2025 15:43
@zhxchen17 zhxchen17 force-pushed the zhxchen17/update_export_api branch 2 times, most recently from 751702a to 5f8af78 Compare November 5, 2025 15:48
@zhxchen17 zhxchen17 marked this pull request as ready for review November 5, 2025 15:48
@zhxchen17
Author

The test failure doesn't reproduce with the latest PyTorch main branch.
