Pull requests: pytorch/torchtitan

All pull requests listed below carry the CLA Signed label, which is managed by the Meta Open Source bot.
[Feat] Gradient sync is turned off during gradient accumulation. (#1710, opened Sep 14, 2025 by EquationWalker)
[Do not merge] Reproduce AC(FSDP(moe.experts)) composibility issue
[CP][RFC] Enable FlexCP for llama3 with parallelize_module (#1707, opened Sep 12, 2025 by fegin)
[WIP] async_tp shape mismatch in rs+mm repro (#1705, opened Sep 12, 2025 by IvanKobzarev)
temp fix state dict loading: avoid cache_state_dict
[mxfp8 moe training] add torchao MXFP8 MoE training integration; bump version guard (#1701, opened Sep 12, 2025 by danielvegamyhre; Draft)
[CP][RFC] Enable FlexCP for llama3 with function wrapper (#1696, opened Sep 10, 2025 by fegin)
grouped expert and shared expert in same graph (#1693, opened Sep 10, 2025 by bobrenjc93; Draft)
[WIP][DSV3] Offload dequantization process to DCP QuantizedHFReader
[Qwen3] Add 32b training configs (#1690, opened Sep 8, 2025 by wwwjn)
[ignore] expert_overlap_compile suggested fixes (#1687, opened Sep 8, 2025 by bobrenjc93)
Separate SAC Wrapping of MoE and Attention Modules to Enable Flex Attention Compilation (#1683, opened Sep 5, 2025 by fegin)
Fake balanced routing in MoE (#1670, opened Sep 1, 2025 by rakkit)
Use new DeviceMesh unflatten to rewrite parallel_dims
Support llama3 autoparallel + pipelining (#1657, opened Aug 28, 2025 by wconstab)
code refactor : making key steps modular train_step() (#1650, opened Aug 28, 2025 by Shagun-G) [fb-exported]
[RFC] Support full bf16 training (#1646, opened Aug 27, 2025 by ebsmothers)
[WIP] DCP: Dequantization and expert grouping for DSv3
[DO NOT REVIEW] debug fsdp2 checkpoint for uneven sharding
add option to use synthetic input data (#1632, opened Aug 25, 2025 by alfuyao1986)
Distributed Scion/Muon (#1630, opened Aug 25, 2025 by rakkit)
allow expert_parallel wrapper to handel kwargs (#1620, opened Aug 22, 2025 by rakkit)