
Commit 4d872c0

Merge pull request #11 from llm-jp/work/revise_max_model_len
revise default max_model_len
2 parents: e4870dc + 4e5c114

File tree

1 file changed (+1, -1)

inference-modules/vllm/schemas.py

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ class ModelConfig(BaseModel):
     gpu_memory_utilization: float = 0.9
     tensor_parallel_size: int = 1
     pipeline_parallel_size: int = 1
-    max_model_len: int = 2048
+    max_model_len: int = 4096
     num_scheduler_steps: int = 8
     enable_prefix_caching: bool = True
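For context, a minimal sketch of how a config like this might be consumed, assuming the fields map one-to-one onto vllm.LLM keyword arguments. The ModelConfig fields mirror those visible in the diff; load_model() and the model name are illustrative assumptions, not part of this commit.

# Hypothetical usage sketch (not from the repository).
from pydantic import BaseModel
from vllm import LLM


class ModelConfig(BaseModel):
    gpu_memory_utilization: float = 0.9
    tensor_parallel_size: int = 1
    pipeline_parallel_size: int = 1
    max_model_len: int = 4096  # raised from 2048 in this commit
    num_scheduler_steps: int = 8
    enable_prefix_caching: bool = True


def load_model(model_name: str, config: ModelConfig) -> LLM:
    # Assumes each config field is passed straight through to the vLLM engine.
    return LLM(
        model=model_name,
        gpu_memory_utilization=config.gpu_memory_utilization,
        tensor_parallel_size=config.tensor_parallel_size,
        pipeline_parallel_size=config.pipeline_parallel_size,
        max_model_len=config.max_model_len,
        num_scheduler_steps=config.num_scheduler_steps,
        enable_prefix_caching=config.enable_prefix_caching,
    )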
