[NeurIPS 2025] Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation

AIoT-Lab-AI4LIFE/ViPET-ReportGen

Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation

Our codebase includes the following stages:

  1. Fine-tune the Vision Encoder
  • Fine-tune the CTViT model on the Vietnamese PET/CT-report dataset. Details can be found in pet-clip/README.md
  • Fine-tune the Cosmos model on the Vietnamese PET/CT-report dataset. Details can be found in Cosmos/README.md
  2. Train and Run Inference with VLMs
  • Train the vision-language model and run inference with it. Details can be found in VLMs/README.md
  3. Clinical Evaluation
  • Extract structured lesion information from the LLM output and clinically evaluate the predictions. Details can be found in clinical_evaluation/README.md
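
To make the clinical evaluation stage more concrete, here is a minimal sketch of how extracted lesion findings could be scored against a reference report. It is not the repository's implementation: the `(location, finding)` pair format, the `lesion_prf` function name, and the choice of set-based precision/recall/F1 are all assumptions made for illustration. The actual extraction schema and metrics are described in clinical_evaluation/README.md.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical representation: one lesion = (anatomical location, finding).
Lesion = Tuple[str, str]


def _normalize(lesions: Iterable[Lesion]) -> Set[Lesion]:
    """Lower-case and strip each field so trivial formatting differences do not count as errors."""
    return {(loc.strip().lower(), finding.strip().lower()) for loc, finding in lesions}


def lesion_prf(predicted: Iterable[Lesion], reference: Iterable[Lesion]) -> Dict[str, float]:
    """Set-based precision, recall, and F1 over exact (location, finding) matches (assumed metric)."""
    pred = _normalize(predicted)
    ref = _normalize(reference)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    predicted = [("Lung", "nodule"), ("liver", "lesion")]
    reference = [("lung", "nodule"), ("bone", "metastasis")]
    print(lesion_prf(predicted, reference))
```

Exact matching is the simplest choice. A real clinical evaluation would likely need to handle synonyms, laterality, and partial matches, which this sketch leaves out.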
