Add ColIdefics3 support with LoRA adapters and ColVision processor #30
Summary
Added modeling support for ColIdefics3 (supports the ColSmol collection, i.e. the SmolVLM model family).
Implementation Details
vidore/colSmol-256M
Discussion Points & Feedback Needed
1. File naming convention
The `model_type` values in HuggingFace are in line with the VLMs (e.g., `idefics3`, not `colidefics3`). With the current setup we either need to:
2. Processor loading
The processors don't load correctly with HF's `AutoProcessor`, so I added a `BaseColVisionProcessor`. Should we:
`ColVisionProcessor`?
3. ModelArgs convention
I used class inheritance for `ModelArgs` (different from ColQwen2_5). I'd appreciate feedback on the preferred pattern.
4. Handling Adapters
I didn't know where to handle LoRA adapters, so for now I left that logic in the modeling file, which of course is not the right place for it. This matters because most ColVision models use adapters.
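As a rough sketch of what moving this out of the modeling file could look like, adapter weights can be folded into the base weights once at load time, so the model code never sees LoRA at all. The helper below is hypothetical and not part of this PR; it assumes the standard LoRA convention of `W' = W + (alpha / rank) * B @ A`, with `A` shaped `(rank, in)` and `B` shaped `(out, rank)`, and uses plain Python lists instead of framework arrays to stay self-contained.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product: (n, k) @ (k, m) -> (n, m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(
    weight: Matrix, lora_a: Matrix, lora_b: Matrix, alpha: float, rank: int
) -> Matrix:
    """Hypothetical helper: return W + (alpha / rank) * (B @ A).

    Assumed shapes: weight is (out, in), lora_a is (rank, in),
    lora_b is (out, rank). Checkpoints may store these differently.
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out, rank) @ (rank, in) -> (out, in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Tiny example: a 2x2 identity weight with a rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 2.0]]    # (rank=1, in=2)
b = [[1.0], [0.0]]  # (out=2, rank=1)
merged = merge_lora(base, a, b, alpha=2.0, rank=1)
```

Merging up front trades the ability to swap adapters at runtime for simpler modeling code; whether that trade-off suits this codebase is exactly the open question above.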
Recommendation
IMO, this codebase could become a drop-in replacement for colpali_engine if we standardize the ColVision model patterns.
Testing
Tested with `vidore/colSmol-256M`: works for both text and image embeddings with multi-vector scoring.
Overall, excited to hear feedback @Blaizzy!
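For reference, the multi-vector scoring mentioned under Testing is the late-interaction (MaxSim) score used by ColVision-style models: each query token vector is matched to its best document vector by dot product, and those maxima are summed. The sketch below is a minimal pure-Python illustration; the function names are hypothetical and are not taken from this PR or from colpali_engine.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: sum over query vectors of the best dot
    product against any document vector."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(
    query_vecs: Sequence[Vector], docs: Sequence[Sequence[Vector]]
) -> List[int]:
    """Return document indices sorted from highest to lowest MaxSim score."""
    scores = [maxsim_score(query_vecs, d) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy data: two query token vectors and two "documents" (sets of patch vectors).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]
doc_b = [[0.5, 0.5]]
order = rank_documents(query, [doc_b, doc_a])
```

In a real pipeline the same computation would run as a batched matrix product on framework arrays; the list-based version only shows the scoring rule.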