Commit 1b3d37d

misc
1 parent 64e4cec commit 1b3d37d

1 file changed, +2 -0 lines changed

torchtitan/models/deepseek_v3/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -75,6 +75,8 @@
         qk_rope_head_dim=64,
         v_head_dim=128,
         mscale=0.70,
+        use_flex_attn=True,
+        attn_mask_type="block_causal",
     ),
     "16B": DeepSeekV3ModelArgs(
         vocab_size=102400,
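
The commit does not say what attn_mask_type="block_causal" means. The sketch below assumes the usual packed-sequence reading: each position attends causally, but only to positions in the same document, with document boundaries marked by an EOS token. The names document_ids and block_causal_mask, the EOS id 0, and the sample tokens are hypothetical and are not torchtitan's API. With use_flex_attn=True the real mask is presumably built through PyTorch FlexAttention rather than as a dense Python list.

```python
def document_ids(tokens, eos_id):
    """Assign each position a document index; an EOS token closes its document."""
    ids, current = [], 0
    for tok in tokens:
        ids.append(current)
        if tok == eos_id:
            current += 1  # the next token starts a new document
    return ids


def block_causal_mask(tokens, eos_id):
    """mask[q][k] is True when query position q may attend to key position k."""
    doc = document_ids(tokens, eos_id)
    n = len(tokens)
    # Causal (k <= q) AND same document (doc[q] == doc[k]).
    return [[k <= q and doc[q] == doc[k] for k in range(n)] for q in range(n)]


# Hypothetical packed sequence: two documents separated by EOS (assumed id 0).
mask = block_causal_mask([5, 6, 0, 7, 8], eos_id=0)
```

Under these assumptions, position 3 (the first token of the second document) can attend to itself but not to positions 0 through 2, which plain causal masking would allow.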