Skip to content

Conversation

cem-sirin
Copy link

Summary

Added modeling support for ColIdefics3 (supports the ColSmol collection, i.e. the SmolVLM model family).

Implementation Details

  • ColIdefics3 model with LoRA adapter support
  • BaseColVisionProcessor for multi-vector retrieval
  • Working example with vidore/colSmol-256M

Discussion Points & Feedback Needed

1. File naming convention

The model_type values in HuggingFace are inline with the VLMs (e.g., idefics3, not colidefics3). With the current setup we either need to:

  • Change file names to match HuggingFace model names, or
  • Adjust the loading code

2. Processor loading

The processors don't load correctly with HF's AutoProcessor, so I added a BaseColVisionProcessor. Should we:

  • Make this a more general ColVisionProcessor?
  • Integrate with existing processor patterns?

3. ModelArgs convention

I used class inheritance for ModelArgs (different from ColQwen2_5). Would prefer feedback on the preferred pattern.

4. Handling Adapters

I didn't know where to handle LoRA adapters, I just left it in the file modeling, ofcourse it is not the place to do it. It's important cause most ColVision models use adapters.

Recommendation

IMO, this codebase could become a drop-in replacement for colpali_engine if we standardize the ColVision model patterns.

Testing

Tested with vidore/colSmol-256M - works for both text and image embeddings with multi-vector scoring.

Yeah overall excited to hear feedback @Blaizzy!

- Implement ColIdefics3 model with adapter loading
- Add BaseColVisionProcessor for multi-vector scoring
- Include working example with vidore/colSmol-256M model
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now let's merge this into colidefics3.py

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets remove this test and add tests in the tests folder.

Copy link
Owner

@Blaizzy Blaizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for this awesome PR!

I'm sorry took so long

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants