Where can I download bloom-7b?
I noticed that int8 quantization is available, but is there an option for int4 quantization?
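To make the int4 question concrete, this is roughly the loading path I have in mind, assuming a Hugging Face transformers + bitsandbytes setup; the model id `bigscience/bloom-7b1` is my own guess at the 7B checkpoint, not something I found documented here:

```python
# Sketch of the kind of int8/int4 loading I'm hoping is supported -- not taken from this repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "bigscience/bloom-7b1"  # assumption: 7B BLOOM checkpoint on the Hub

# int8: weights quantized to 8-bit via bitsandbytes
model_int8 = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

# int4: 4-bit NF4 quantization with fp16 compute -- is an equivalent option available?
model_int4 = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
```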
What is the memory overhead for int4 and int8 when fine-tuning with LoRA or P-Tuning? Are there any fine-tuning scripts available?
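To clarify what I mean by LoRA on a quantized base, here is a minimal sketch using PEFT (recent versions); `target_modules=["query_key_value"]` is my assumption based on BLOOM's fused attention projection name, and `model_int8` refers to the hypothetical quantized model loaded above:

```python
# Minimal LoRA-on-quantized-base sketch using PEFT -- a guess at the setup, not a repo script.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = prepare_model_for_kbit_training(model_int8)  # or model_int4

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # assumption: BLOOM's attention projection module
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # LoRA should add only a small fraction of trainable params
```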
Additionally, are there inference scripts available for int4 quantization? How much GPU memory is required for int4 and int8 inference, respectively?
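For reference, my own back-of-envelope estimate for the weights alone (ignoring activations, the KV cache, and quantization metadata) is below; I'd appreciate confirmation of the real numbers:

```python
# Rough weight-memory estimate for a ~7B-parameter model (my own math, not measured).
n_params = 7e9
print(f"int8 weights: ~{n_params * 1.0 / 1e9:.1f} GB")  # ~7.0 GB at 1 byte/param
print(f"int4 weights: ~{n_params * 0.5 / 1e9:.1f} GB")  # ~3.5 GB at 0.5 byte/param
```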