ZSLViT

This repository contains the training and testing code for the CVPR'24 paper "Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning".

Requirements

ZSLViT is implemented mainly in PyTorch. All of our experiments were run and tested with Python 3.9.7. To install the required dependencies:

$ pip install -r requirements.txt

Preparing Dataset and Model

We provide trained models (Google Drive) for three datasets (CUB, SUN, and AWA2) in both the CZSL and GZSL settings. Download the model files and the corresponding datasets, and organize them as follows:

.
├── saved_model
│   ├── ZSLViT_CUB_CZSL.pth
│   ├── ZSLViT_CUB_GZSL.pth
│   ├── ZSLViT_SUN_CZSL.pth
│   ├── ZSLViT_SUN_GZSL.pth
│   ├── ZSLViT_AWA2_CZSL.pth
│   └── ZSLViT_AWA2_GZSL.pth
├── data
│   ├── CUB/
│   ├── SUN/
│   └── AWA2/
└── ···
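
Before training or testing, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name missing_paths is hypothetical, and it only checks that the folders and checkpoint files listed in the tree exist, not that their contents are valid.

```python
from pathlib import Path

# Dataset and setting names taken from the directory tree above.
DATASETS = ("CUB", "SUN", "AWA2")
SETTINGS = ("CZSL", "GZSL")


def missing_paths(root="."):
    """Return the expected folders/files from the layout that are absent under root.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    root = Path(root)
    # One data folder per dataset.
    expected = [root / "data" / name for name in DATASETS]
    # One checkpoint per (dataset, setting) pair.
    expected += [
        root / "saved_model" / f"ZSLViT_{name}_{setting}.pth"
        for name in DATASETS
        for setting in SETTINGS
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:")
        for path in missing:
            print("  " + path)
    else:
        print("Layout looks complete.")
```

Run it from the repository root; an empty result means every folder and checkpoint named in the tree was found.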

Train

Modify the wandb_config file as needed, then run the following command to train ZSLViT:

$ python train.py

Test

In the wandb_config file, set gzsl to True or False to choose the setting, then run the following commands to test ZSLViT on each dataset:

CUB Dataset:

$ python test_CUB.py      # CZSL Setting and GZSL Setting

SUN Dataset:

$ python test_SUN.py      # CZSL Setting and GZSL Setting

AWA2 Dataset:

$ python test_AWA2.py     # CZSL Setting and GZSL Setting

Results

The table below reports the results of our released models on the three datasets in both the conventional ZSL (CZSL) and generalized ZSL (GZSL) settings.

Dataset	Acc(CZSL)	U(GZSL)	S(GZSL)	H(GZSL)
CUB	78.9	69.4	78.2	73.6
SUN	68.3	45.9	48.4	47.3
AWA2	70.7	66.1	84.6	74.2
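
In the GZSL columns, U and S conventionally denote accuracy on unseen and seen classes, and H their harmonic mean. The sketch below assumes that standard definition, H = 2US / (U + S); the function name is hypothetical and this is not the repository's evaluation code. The released numbers come from the authors' evaluation, so recomputing H from the rounded table entries need not reproduce the H column exactly.

```python
def harmonic_mean(unseen_acc, seen_acc):
    """Harmonic mean of unseen-class and seen-class accuracy (standard GZSL metric).

    Assumes the usual definition H = 2 * U * S / (U + S); illustrative only.
    """
    total = unseen_acc + seen_acc
    if total == 0:
        # Both accuracies are zero; define H as 0 to avoid division by zero.
        return 0.0
    return 2 * unseen_acc * seen_acc / total


if __name__ == "__main__":
    # AWA2 row of the table: U = 66.1, S = 84.6.
    print(round(harmonic_mean(66.1, 84.6), 1))
```

The harmonic mean is low whenever either accuracy is low, so a model cannot score well on H by favoring seen classes alone.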

Note: All experiments were performed on a single NVIDIA Tesla V100 GPU with 32GB of memory.

Acknowledgement ❤️

This project is based on EViT (paper) and ViT-ZSL (paper). We thank the authors for their wonderful work.

Citation

If you find ZSLViT useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

@inproceedings{chen2024progressive,
  title={Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning},
  author={Chen, Shiming and Hou, Wenjin and Khan, Salman and Khan, Fahad Shahbaz},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23964--23974},
  year={2024}
}
