Add a simple weight sync sandbox #531

allenwang28 · 2025-11-06T16:59:48Z

Adding this as a development platform for weight sync optimizations.

casteryh · 2025-11-06T17:35:38Z

Do we want this to run multiple iterations, or indefinitely?

allenwang28 · 2025-11-06T17:37:48Z

> Do we want this to run multiple iterations, or indefinitely?

For this sandbox, just a single run, so we can time how long one weight sync step takes.
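For reference, timing a single step can be sketched roughly as below. This is only an illustration, not the sandbox's actual code: `timed` and `fake_weight_sync` are hypothetical names, and the sleep stands in for the real weight push/pull.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record wall-clock seconds for the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def fake_weight_sync():
    # Hypothetical stand-in for the real weight sync step.
    time.sleep(0.01)


results = {}
with timed("weight_sync", results):
    fake_weight_sync()

print(f"weight_sync took {results['weight_sync']:.4f}s")
```

If the step launches asynchronous GPU work, the device would likely need to be synchronized before the clock is read, otherwise the measurement only covers the launch.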

JenniferWang · 2025-11-06T17:38:56Z

I don't have any objection to adding a new sandbox test -- it's just that I've been using this one: https://fburl.com/code/8400zng6
So would it be reasonable to consolidate the two apps?

casteryh · 2025-11-06T17:40:27Z

tests/sandbox/weight_sync/qwen3_1_7b.yaml

+    logging_mode: global_reduce
+
+policy:
+  prefetch_weights_to_shm: false  # Disable to avoid shared memory warnings in test


what warnings are you seeing?

It spams resource_tracking messages saying that the shared memory files don't exist anymore. Claude couldn't figure it out, so I just disabled it lol
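The thread does not pin down the cause of these warnings. As background only, and assuming the prefetch path uses something like Python's standard `multiprocessing.shared_memory` (an assumption, not confirmed here), messages of this kind can show up when a segment is removed while another process's resource tracker still has it registered. A minimal sketch of the usual lifecycle, where every handle closes its mapping and only the owner unlinks:

```python
from multiprocessing import shared_memory

# Owner creates a small named segment and writes some bytes into it.
shm = shared_memory.SharedMemory(create=True, size=16)
try:
    shm.buf[:5] = b"hello"

    # A second handle attaches by name, as a reader would.
    reader = shared_memory.SharedMemory(name=shm.name)
    try:
        payload = bytes(reader.buf[:5])
    finally:
        reader.close()  # each handle closes its own mapping
finally:
    shm.close()
    shm.unlink()  # only the owner removes the segment, exactly once

print(payload)
```

Python 3.13 added a `track=False` argument to `SharedMemory` so that attaching processes can opt out of resource tracking; whether that applies to this code path is not established in the thread.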

casteryh · 2025-11-06T17:41:27Z

tests/sandbox/weight_sync/qwen3_1_7b.yaml

+# Weight Sync Sandbox Configuration
+# >>> python -m tests.sandbox.weight_sync.main --config tests/sandbox/weight_sync/qwen3_1_7b.yaml
+
+model: "Qwen/Qwen3-1.7B"


I feel like we could use a larger model like 8b

we can add more model configs as needed

casteryh · 2025-11-06T17:42:41Z

lgtm but doesn't the integration test do exactly this (+ verification)? Why do we need a separate one?

allenwang28 · 2025-11-06T18:38:44Z

> So is it reasonable to consolidate the two apps?

> doesn't the integration test do exactly this (+ verification)? Why do we need a separate one?

I envisioned this as just a temporary sandbox that prioritizes hacking and fast iteration times. I find that developing against pytest can add overhead (logging, etc.), so I'm OK with these two being separate things. The pytest integration test is still very helpful, e.g., for making sure in CI that this passes consistently.

Does this separation make sense?

Allen Wang added 6 commits November 5, 2025 14:39

weight sync sandbox

404484e

some cleanups

6ee6cd6

need to do some more stuff

7fc533a

slight edit

bf83bf4

comment

b077a65

no claude code

252b2b9

allenwang28 requested review from JenniferWang and casteryh November 6, 2025 16:59

meta-cla bot added the CLA Signed label Nov 6, 2025

Merge branch 'main' into weight_sync

4a7281e

casteryh reviewed Nov 6, 2025

casteryh approved these changes Nov 6, 2025

allenwang28 merged commit f55bac8 into meta-pytorch:main Nov 6, 2025
10 checks passed

allenwang28 deleted the weight_sync branch November 6, 2025 19:42
