I am interested in contributing to Keras by implementing DINOv3 (self-distillation with no labels, v3), a state-of-the-art self-supervised Vision Transformer model, as an example/tutorial. Before proceeding, I would like to confirm whether this aligns with the project's goals and whether there are existing implementations or guidelines I should be aware of.
Why DINOv3?
- State-of-the-art performance: DINOv3 learns strong general-purpose visual features without any labeled data, making it a valuable addition to the Keras examples.
- Versatility: Its features serve as a strong backbone for downstream tasks such as image classification, segmentation, and object detection.
- Alignment with Keras 3: Given Keras 3's multi-backend support (TensorFlow, JAX, PyTorch), implementing DINOv3 would showcase the framework's flexibility, as the sketch after this list illustrates.
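For reference, switching backends in Keras 3 only requires setting an environment variable before the first import, so the same example code could run on all three backends:

```python
import os

# Keras 3 reads KERAS_BACKEND before the first `import keras`;
# "tensorflow", "jax", and "torch" are the supported values.
os.environ["KERAS_BACKEND"] = "jax"

import keras

print(keras.backend.backend())  # prints "jax"
```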
Implementation Plan:
- Model Architecture: Implement a Vision Transformer (ViT) backbone trained with the DINO self-distillation objective (see the sketches after this list).
- Training: Train on a standard dataset; CIFAR-10 keeps the tutorial lightweight, while ImageNet is closer to the original large-scale setup.
- Backend Compatibility: Ensure the implementation runs unchanged on the TensorFlow, JAX, and PyTorch backends.
- Documentation: Provide clear instructions on how to use the model, including training and evaluation scripts.
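To make the architecture item concrete, here is a rough sketch of what a plain pre-norm ViT encoder might look like in Keras 3. All hyperparameters (patch size, depth, width) are illustrative placeholders rather than the DINOv3 configuration, and details such as the [CLS] token and the DINO projection head are omitted for brevity:

```python
import keras
from keras import layers


class AddPositionEmbedding(layers.Layer):
    """Adds a learnable positional embedding to the patch tokens."""

    def build(self, input_shape):
        self.pos = self.add_weight(
            shape=(1, input_shape[1], input_shape[2]),
            initializer="random_normal",
            name="pos_embedding",
        )

    def call(self, x):
        return x + self.pos


def vit_backbone(image_size=224, patch_size=16, dim=384, depth=12,
                 num_heads=6, mlp_ratio=4):
    """A plain pre-norm ViT encoder; all sizes here are placeholders."""
    num_patches = (image_size // patch_size) ** 2
    inputs = keras.Input(shape=(image_size, image_size, 3))

    # Patchify with a strided convolution, then flatten to a token sequence.
    x = layers.Conv2D(dim, patch_size, strides=patch_size)(inputs)
    x = layers.Reshape((num_patches, dim))(x)
    x = AddPositionEmbedding()(x)

    for _ in range(depth):
        # Pre-norm multi-head self-attention with a residual connection.
        y = layers.LayerNormalization(epsilon=1e-6)(x)
        y = layers.MultiHeadAttention(num_heads, dim // num_heads)(y, y)
        x = x + y
        # Pre-norm MLP block with a residual connection.
        y = layers.LayerNormalization(epsilon=1e-6)(x)
        y = layers.Dense(dim * mlp_ratio, activation="gelu")(y)
        y = layers.Dense(dim)(y)
        x = x + y

    x = layers.LayerNormalization(epsilon=1e-6)(x)
    return keras.Model(inputs, x, name="vit_backbone")
```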
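And a sketch of the core self-distillation objective that DINO is built on: the student matches a centered, sharpened softmax of the teacher's output, while the teacher's weights track the student as an exponential moving average. This is the original DINO formulation; the full DINOv3 recipe layers additional objectives on top, so treat this only as a starting point. The temperatures and momentum value below are illustrative:

```python
from keras import ops


def dino_loss(teacher_logits, student_logits, center,
              teacher_temp=0.04, student_temp=0.1):
    """Cross-entropy between teacher and student distributions.

    The teacher output is centered (to discourage collapse) and
    sharpened with a low temperature; no gradient flows through
    the teacher.
    """
    teacher_probs = ops.stop_gradient(
        ops.softmax((teacher_logits - center) / teacher_temp, axis=-1)
    )
    student_log_probs = ops.log_softmax(student_logits / student_temp, axis=-1)
    return -ops.mean(ops.sum(teacher_probs * student_log_probs, axis=-1))


def ema_update(teacher, student, momentum=0.996):
    """Teacher weights track the student by exponential moving average."""
    for t_var, s_var in zip(teacher.weights, student.weights):
        t_var.assign(momentum * t_var + (1.0 - momentum) * s_var)
```

In the DINO papers, `center` is itself an exponential moving average of the teacher's batch outputs; that update is omitted here for brevity.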