
@jveitchmichaelis (Collaborator) commented Aug 21, 2025

This PR adds support for a basic DinoV3 backbone for RetinaNet.
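
For orientation, the general mechanics look roughly like this (a minimal sketch, not the PR's implementation: `ViTBackbone` is a hypothetical wrapper, and `vit` is assumed to map a (B, 3, H, W) image batch to a (B, C, H/16, W/16) feature map; DinoV3 weight loading is omitted):

```python
import torch
from torch import nn
from torchvision.models.detection import RetinaNet
from torchvision.models.detection.anchor_utils import AnchorGenerator


class ViTBackbone(nn.Module):
    """Hypothetical wrapper: exposes `out_channels` so torchvision's
    RetinaNet can consume a single-scale ViT feature map."""

    def __init__(self, vit: nn.Module, out_channels: int):
        super().__init__()
        self.vit = vit
        self.out_channels = out_channels  # RetinaNet reads this attribute

    def forward(self, x: torch.Tensor) -> dict:
        # One stride-16 feature level keeps the wiring minimal.
        return {"0": self.vit(x)}


# One anchor-size tuple per feature level (a single level here).
anchor_gen = AnchorGenerator(
    sizes=((32, 64, 128),), aspect_ratios=((0.5, 1.0, 2.0),)
)
# model = RetinaNet(ViTBackbone(vit, out_channels=1024), num_classes=2,
#                   anchor_generator=anchor_gen)
```

torchvision's RetinaNet only requires that the backbone expose an `out_channels` attribute, and the anchor generator must supply exactly one size tuple per feature level.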

As this is a WIP, I've added a few improvements to the CLI for debugging and logging; some of these I'd like to PR separately. There is also a minor fix to the dataset so that it actually uses root_dir for CSVs with full image paths, and a new config option for the log folder.

To use Comet, make COMET_API_KEY and COMET_WORKSPACE available in your environment.
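
For reference, a minimal sketch of how those variables are typically consumed (this assumes the CLI wires up a Lightning `CometLogger`; keyword names can vary between Lightning versions):

```python
# Minimal sketch, assuming a Lightning CometLogger is used; exact keyword
# names may differ slightly between Lightning versions.
import os

from lightning.pytorch.loggers import CometLogger

logger = CometLogger(
    api_key=os.environ["COMET_API_KEY"],
    workspace=os.environ["COMET_WORKSPACE"],
    project_name="deepforest",  # illustrative project name
)
```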

Train with:

```bash
[uv run] deepforest --config-name dinov3 train
```

Please try to use the CLI as much as possible so we can test the user experience.

For development, I'd suggest making another config file with the train/val directories set up:

```yaml
defaults:
  - dinov3
  - _self_

train:
  csv_file:
  root_dir:

validation:
  csv_file:
  root_dir:
```

This will probably fail CI because we need to add a secret to pull the weights for testing. Locally the sanity checks pass (inference + train forward).

@jveitchmichaelis marked this pull request as draft August 21, 2025 21:38
@bw4sz self-requested a review August 21, 2025 21:49
@jveitchmichaelis force-pushed the dinov3 branch 2 times, most recently from 6639118 to 4996ef0 on August 22, 2025 20:34
@jveitchmichaelis (Collaborator, Author) commented Aug 22, 2025

I think these are roughly the different paths we're comparing (except we would always start from a COCO-pretrained ResNet):

```mermaid
flowchart TD

    %% Datasets -> Backbones
    ImageNet([ImageNet]) --> ResNet[ResNet Backbone]
    ImageNet -.-> MSCOCO([MS-COCO])
    MSCOCO --> ResNet

    Sat493M([Sat-493M]) --> Dinov3[Dinov3 Backbone]
    LVD1689M([LVD-1689M]) --> Dinov3

    %% Backbones -> Pretrained RetinaNet
    ResNet --> Baseline[Pre-Trained RetinaNet]
    Dinov3 --> Baseline

    %% Fine-tuning paths
    Baseline --> FineTuned([Hand Annotations])
    Baseline -.-> LIDAR([Weak LIDAR Supervision])
    LIDAR --> FineTuned

    %% Merge paths into evaluation
    FineTuned --> NeonTree([NeonTreeEvaluation])
```

@jveitchmichaelis (Collaborator, Author) commented Aug 22, 2025

In-progress training logs can be found here: https://www.comet.com/jveitchmichaelis/deepforest/view/new/panels

To dos:

  • Evaluation for DinoV3-ViT-L (300M params)
  • Evaluation for DinoV3-ViT-7B (7B params)
  • Evaluation for ResNet50 (25M params) to confirm reproducibility of existing pipeline

Currently performing cross-evaluation on the training dataset, followed by a "holdout" run on all train + NeonTreeEval. All Dino backbones are frozen for now (see the sketch after the list), but generally we fine-tune ResNet. Previous hyper-parameters for ResNet:

  • 40 epochs
  • lr 1e-4

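As a point of reference, "frozen" here means roughly the following (a minimal sketch assuming a torchvision-style detector with a `backbone` attribute; the helper name and optimizer choice are illustrative, not DeepForest's actual API):

```python
# Minimal sketch, assuming a torchvision-style detector with a `backbone`
# attribute. The helper name and optimizer choice are illustrative.
import torch


def freeze_backbone(model: torch.nn.Module) -> torch.optim.Optimizer:
    # Disable gradients for every backbone parameter.
    for param in model.backbone.parameters():
        param.requires_grad = False
    # Optimize only what still requires grad (FPN, heads, etc.),
    # using the previous ResNet learning rate quoted above.
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=1e-4)
```
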
We may also want different hyper-parameters for feature pooling, following the conventions in ViTDet: https://arxiv.org/abs/2203.16527
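
For context, ViTDet builds a "simple feature pyramid" from the ViT's single stride-16 output rather than running an FPN over multiple backbone stages. A rough sketch of the idea (channel sizes illustrative; the paper additionally applies norm and 1x1/3x3 convs per level):

```python
import torch
from torch import nn


class SimpleFeaturePyramid(nn.Module):
    """Sketch of a ViTDet-style pyramid: multi-scale maps derived from
    the single stride-16 ViT feature map via deconvolution and pooling."""

    def __init__(self, dim: int = 1024, out_dim: int = 256):
        super().__init__()
        self.up4 = nn.Sequential(  # stride 16 -> 4
            nn.ConvTranspose2d(dim, dim // 2, 2, stride=2),
            nn.GELU(),
            nn.ConvTranspose2d(dim // 2, out_dim, 2, stride=2),
        )
        self.up2 = nn.ConvTranspose2d(dim, out_dim, 2, stride=2)  # 16 -> 8
        self.same = nn.Conv2d(dim, out_dim, 1)                    # 16 -> 16
        self.down = nn.Sequential(nn.MaxPool2d(2),                # 16 -> 32
                                  nn.Conv2d(dim, out_dim, 1))

    def forward(self, x: torch.Tensor) -> dict:
        return {"p2": self.up4(x), "p3": self.up2(x),
                "p4": self.same(x), "p5": self.down(x)}
```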

@bw4sz (Collaborator) commented Sep 9, 2025

Related to our recent conversation: is this PR WIP, or is it ready for review? I'm requested as a reviewer, but it's still marked WIP.

@jveitchmichaelis (Collaborator, Author) commented Sep 9, 2025

You requested your own review 😅 I don't anticipate many code changes here, but I would make some of the CLI improvements optional (I don't want to force Comet on people, for example). I'd welcome a review of the model aspects at least.

As for taking it out of WIP: do we want to go ahead and support this as a backbone now, or wait for the pretraining results to see whether it makes sense to add it as an option?

The only other thing would be the relative pathing we discussed, to support huge datasets that may be organised in subfolders.

@bw4sz (Collaborator) commented Sep 10, 2025

> You requested your own review 😅 I don't anticipate many code changes here, but I would make some of the CLI improvements optional (I don't want to force Comet on people, for example). I'd welcome a review of the model aspects at least.
>
> As for taking it out of WIP: do we want to go ahead and support this as a backbone now, or wait for the pretraining results to see whether it makes sense to add it as an option?
>
> The only other thing would be the relative pathing we discussed, to support huge datasets that may be organised in subfolders.

Do you think the improvements you've made are broader than just this one backbone? I think so, so they should go in regardless of the pretraining adventure. What do you think? I haven't reviewed yet; I forgot I requested it, so I'll remove myself. It's up to you when to take the WIP label off. I'll wait, there are plenty of other PRs.

@jveitchmichaelis (Collaborator, Author) commented Sep 10, 2025

I would probably include the CLI improvements. I've been trying to use the CLI as my only training command to see if there are things it's missing, and I added some sensible defaults for loggers/callbacks, the output folder, etc.

But otherwise the main contribution is the backbone definition and a tweak to how backbones are selected for fine-tuning (e.g. ImageNet / COCO).
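
For the ImageNet / COCO distinction, the underlying torchvision mechanics look roughly like this (a sketch of the idea, not the PR's code; the function name is illustrative):

```python
# Illustrative sketch of selecting pretrained weights with torchvision;
# not the PR's actual code.
from torchvision.models import ResNet50_Weights
from torchvision.models.detection import (
    RetinaNet_ResNet50_FPN_Weights,
    retinanet_resnet50_fpn,
)


def build_retinanet(pretrain: str = "coco"):
    if pretrain == "coco":
        # Full COCO-pretrained detector, backbone included.
        return retinanet_resnet50_fpn(
            weights=RetinaNet_ResNet50_FPN_Weights.COCO_V1
        )
    # ImageNet-pretrained backbone only; detection head from scratch.
    return retinanet_resnet50_fpn(
        weights=None, weights_backbone=ResNet50_Weights.IMAGENET1K_V1
    )
```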

@jveitchmichaelis force-pushed the dinov3 branch 8 times, most recently from 8864087 to 0ed2528 on September 13, 2025 00:40
@jveitchmichaelis (Collaborator, Author) commented
Going to start moving out-of-scope changes into other PRs. The core of this one should just be the model backbone, IMO.

@jveitchmichaelis force-pushed the dinov3 branch 7 times, most recently from 6ceb099 to d856c2d on September 13, 2025 16:22
@bw4sz removed their request for review October 9, 2025 17:58