vllm-project / llm-compressor Public

Pull requests: vllm-project/llm-compressor

34 Open 754 Closed

Pull requests list

Fix wonky dependency range on datasets

#1774 opened Aug 22, 2025 by timkpaine

[Multi-modifier] Support scoped application of quantization config/status

#1772 opened Aug 21, 2025 by brian-dellabetta • Draft

3 of 6 tasks

[MoE] Llama4 and More tests

#1760 opened Aug 19, 2025 by kylesayrs • Draft

[Tests] Add recovery-based validation to LM-Eval tests

#1750 opened Aug 18, 2025 by rahul-tuli • Draft

2 of 7 tasks

[Transform] SpinQuant R4

#1746 opened Aug 18, 2025 by kylesayrs • Draft

[bugfix] Fix indentation errors in the README file

#1737 opened Aug 15, 2025 by qibaoyuan

Enable xpu device

#1736 opened Aug 15, 2025 by jiqing-feng

[Utils] Offloaded cache size

#1714 opened Aug 7, 2025 by kylesayrs

[Tracing] Decouple vision tower from first layer (label: ready)

#1710 opened Aug 6, 2025 by kylesayrs

[WIP] [MoE] GPT OSS

#1705 opened Aug 5, 2025 by kylesayrs • Draft

[MoE] Add conditional expert calibration

#1701 opened Aug 1, 2025 by dichn • Draft

[Example] [VLM] Gemma3n

#1696 opened Jul 31, 2025 by kylesayrs • Draft

[Autowrapper] Support Gemma3, autowrapper improvements

#1693 opened Jul 30, 2025 by kylesayrs

1686 Logic matching refactor

#1687 opened Jul 28, 2025 by ved1beta

[AWQ] Allow for activation quantization

#1682 opened Jul 24, 2025 by brian-dellabetta • Draft

add quantization_w4a4_fp4 qwen3 example

#1681 opened Jul 24, 2025 by wangwenmingaa

[KV Cache] support kv cache int8 per channel quantization (label: ready)

#1663 opened Jul 19, 2025 by Eviannn

[Transform] Online Rotations

#1651 opened Jul 16, 2025 by kylesayrs • Draft

[Tests] Spinquant dummy model tests

#1647 opened Jul 15, 2025 by kylesayrs • Draft

Minor speedup for infer_quantization_format when save_compressed=False

#1636 opened Jul 10, 2025 by kylesayrs

[WIP] [Research] Attention quantization and transformation

#1612 opened Jul 1, 2025 by kylesayrs • Draft

[Pipelines] Add propagate_error argument (label: ready)

#1575 opened Jun 20, 2025 by kylesayrs • Draft

[GPTQ] Use torch.compile to speed up gptq algo (label: ready)

#1561 opened Jun 17, 2025 by aladerran

Disable sequential_targets from modifiers (label: ready)

#1559 opened Jun 16, 2025 by kylesayrs • Draft

[Performance] Parallelize modifier compression

#1558 opened Jun 16, 2025 by kylesayrs • Draft
