Paper: https://arxiv.org/abs/2412.06014
Project page: https://aaltoml.github.io/BayesVLM/
- Ensure you have Python version >= 3.11 installed.
- Install the required packages by running:
  ```
  pip install -r requirements.txt
  ```
- Set `DATA_BASE_DIR` in your `.env` file. You can use the structure from the `.env.example` file:
  ```
  DATA_BASE_DIR=/path/to/datasets
  ```
- Add the project root directory to the `PYTHONPATH` environment variable:
  ```
  export PYTHONPATH=$PYTHONPATH:/path/to/project/root
  ```
- (Optional) If you use an M1 Mac with `mps` support, you'll need to set the following environment variable:
  ```
  export PYTORCH_ENABLE_MPS_FALLBACK=1
  ```
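As a quick sanity check after completing the steps above, a small hypothetical snippet (not part of the repository) can verify that `DATA_BASE_DIR` is set and points to an existing directory before you launch any experiments:

```python
import os
from pathlib import Path

def check_data_base_dir() -> Path:
    """Return DATA_BASE_DIR as a Path, failing loudly if it is unset or missing."""
    value = os.environ.get("DATA_BASE_DIR")
    if value is None:
        raise RuntimeError("DATA_BASE_DIR is not set; see .env.example for the expected format.")
    path = Path(value)
    if not path.is_dir():
        raise RuntimeError(f"DATA_BASE_DIR does not point to an existing directory: {path}")
    return path
```

Failing early here is cheaper than a cryptic file-not-found error deep inside a data loader.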
To run the Hessian estimation code, use the following command:
```
python scripts/hessian_estimation.py
```
To run the code for the zero-shot experiments, use the following command:
```
python scripts/zeroshot.py
```
To run the code for the active-learning experiments, use the following command:
```
python scripts/activelearning.py
```
Note that each of these commands has additional arguments that allow adjusting the Hessian estimation and the zero-shot/active-learning experiments.
The precomputed Hessians for the models used in the paper are available in the `hessians/` folder. You can select a specific Hessian by setting `--hessian_dir` in the provided scripts.
A notebook stepping through the zero-shot code is available in `notebooks/zeroshot.ipynb`.
The data is stored in the `DATA_BASE_DIR` folder and is structured as follows:
```
DATA_BASE_DIR/
├── cifar10/
├── cifar100/
├── eurosat/
├── flowers102/
├── food101/
├── homeoffice/
├── imagenet_r/
├── imagenet_val_wds/
├── laion400m/
├── sun397/
├── ucf101/
```
Please set the `DATA_BASE_DIR` environment variable accordingly.
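To see at a glance which of these folders are already in place, a hypothetical helper (the folder names are taken from the listing above; the function is not part of the repository) could look like:

```python
from pathlib import Path

# Dataset folders expected under DATA_BASE_DIR (names from the listing above).
EXPECTED_DATASETS = [
    "cifar10", "cifar100", "eurosat", "flowers102", "food101",
    "homeoffice", "imagenet_r", "imagenet_val_wds", "laion400m",
    "sun397", "ucf101",
]

def missing_datasets(base_dir: str) -> list[str]:
    """Return the expected dataset folders that do not yet exist under base_dir."""
    base = Path(base_dir)
    return [name for name in EXPECTED_DATASETS if not (base / name).is_dir()]
```

Datasets you do not plan to use can of course stay missing; the helper only reports, it does not download anything.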
The CIFAR-10 dataset is automatically downloaded by the Hugging Face `datasets` library.
The CIFAR-100 dataset is automatically downloaded by the Hugging Face `datasets` library.
The following instructions are adapted from https://github.com/vishaal27/SuS-X/blob/main/data/DATA.md.
- Create a folder named `eurosat/` under `DATA_BASE_DIR`.
- Download the dataset from http://madm.dfki.de/files/sentinel/EuroSAT.zip and extract it to `DATA_BASE_DIR/eurosat/`.
- Download `split_zhou_EuroSAT.json` from here and put it under `DATA_BASE_DIR/eurosat`.

The directory structure should look like
```
eurosat/
|–– 2750/
|–– split_zhou_EuroSAT.json
```
The Flowers102 dataset is automatically downloaded by the torchvision library.
The Food101 dataset is automatically downloaded by the torchvision library.
Download the dataset from https://www.hemanthdv.org/officeHomeDataset.html and extract it to `DATA_BASE_DIR/homeoffice/`.
The directory structure should look like
```
homeoffice/
|–– Art/
|–– Clipart/
|–– Product/
|–– Real World/
|–– ImageInfo.csv
|–– imagelist.txt
```
Follow the instructions in pytorch/vision#7545 (comment) to download the dataset and extract it to `DATA_BASE_DIR/stanford_cars/`.
The DTD dataset is automatically downloaded by the torchvision library.
We supply the script `scripts/download_imagenet.py` to download all validation tar files for the ImageNet dataset from the Hugging Face Datasets Hub.
After running the script, the directory structure should look like
```
imagenet_val_wds/
|–– imagenet1k-validation-00.tar
|–– imagenet1k-validation-01.tar
|–– ...
|–– imagenet1k-validation-63.tar
```
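Since the validation set is sharded into 64 tar files (`imagenet1k-validation-00.tar` through `imagenet1k-validation-63.tar`), a hypothetical completeness check (not part of the repository) can confirm that no shard was lost to an interrupted download:

```python
from pathlib import Path

def missing_imagenet_shards(wds_dir: str, num_shards: int = 64) -> list[str]:
    """Return the names of expected validation shards absent from wds_dir."""
    base = Path(wds_dir)
    expected = (f"imagenet1k-validation-{i:02d}.tar" for i in range(num_shards))
    return [name for name in expected if not (base / name).is_file()]
```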
Download the dataset from https://github.com/hendrycks/imagenet-r and extract it to `./data/imagenet-r/`.
The LAION-400M dataset can be downloaded using the `img2dataset` tool; the instructions for LAION-400M are available here.
Before running the `img2dataset` script, we removed all data points marked as NSFW in the metadata.
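The filtering step itself is not included in the repository. As a sketch, assuming the metadata has been exported to CSV with an `NSFW` column whose flagged value is the string `"NSFW"` (both the column name and the value are assumptions, not the repo's actual format), it could look like:

```python
import csv

def filter_safe_rows(metadata_csv: str, output_csv: str) -> int:
    """Copy only rows not marked NSFW into output_csv; return the number kept.

    Assumes a CSV export of the metadata with an 'NSFW' column whose flagged
    value is the string 'NSFW' -- both are assumptions about the format.
    """
    kept = 0
    with open(metadata_csv, newline="", encoding="utf-8") as src, \
         open(output_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row.get("NSFW") != "NSFW":  # drop rows flagged as NSFW
                writer.writerow(row)
                kept += 1
    return kept
```

The filtered file can then be handed to `img2dataset` in place of the raw metadata.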
- Create a folder named `sun397/` under `./data`.
- Download the images from http://vision.princeton.edu/projects/2010/SUN/SUN397.tar.gz.
- Download the partitions from https://vision.princeton.edu/projects/2010/SUN/download/Partitions.zip.
- Extract these files under `./data/sun397/`.
- Download `split_zhou_SUN397.json` from this link and put it under `./data/sun397`.

The directory structure should look like
```
sun397/
|–– SUN397/
|–– split_zhou_SUN397.json
|–– ... # a bunch of .txt files
```
- Create a folder named `ucf101/` under `./data`.
- Download the zip file `UCF-101-midframes.zip` from here and extract it to `./data/ucf101/`. This zip file contains the extracted middle video frames.
- Download `split_zhou_UCF101.json` from this link and put it under `./data/ucf101`.

The directory structure should look like
```
ucf101/
|–– UCF-101-midframes/
|–– split_zhou_UCF101.json
```
```bibtex
@article{baumann2024bayesvlm,
  title   = {Post-hoc Probabilistic Vision-Language Models},
  author  = {Anton Baumann and Rui Li and Marcus Klasson and Santeri Mentu and Shyamgopal Karthik and Zeynep Akata and Arno Solin and Martin Trapp},
  year    = {2024},
  journal = {arXiv preprint arXiv:2412.06014}
}
```

This software is provided under the MIT license.
