nkatyal (Contributor) commented on Apr 15, 2025

These changes introduce a newer version of the training script that uses HF Accelerate. The functionality remains the same, but the training environment is now easier for the user to set up.
The changes have been tested in the following environments:

🔍 Model Training Comparison (Old vs. New Setup)

| Model | GPUs | Training Time (Old / New) | MRR (Old / New) |
| --- | --- | --- | --- |
| BERT Short | 1 × NVIDIA L4 (24GB) | 3h 30m / 3h 22m | 0.3319 / 0.3401 |
| ElectraBERT MAXP | 4 × NVIDIA T4 (16GB each) | 6h 15m / 6h 30m | 0.3364 / 0.3428 |
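
For readers comparing the MRR column, here is a minimal sketch of how mean reciprocal rank is commonly computed. It is not this repository's evaluation code. The function names, the `runs`/`qrels` dictionary layout, and the `cutoff` of 10 are all assumptions made for illustration; the comparison above does not state which cutoff was used.

```python
from typing import Dict, Sequence, Set


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str], cutoff: int = 10) -> float:
    """Return 1/rank of the first relevant document within the cutoff, or 0.0 if none."""
    for rank, doc_id in enumerate(ranked_ids[:cutoff], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    runs: Dict[str, Sequence[str]],   # hypothetical layout: query id -> ranked document ids
    qrels: Dict[str, Set[str]],       # hypothetical layout: query id -> relevant document ids
    cutoff: int = 10,                 # assumed cutoff, not taken from the PR
) -> float:
    """Average the reciprocal rank over all queries that have relevance judgments."""
    if not qrels:
        return 0.0
    total = 0.0
    for query_id, relevant_ids in qrels.items():
        # A query with no ranked list contributes a reciprocal rank of 0.
        total += reciprocal_rank(runs.get(query_id, []), relevant_ids, cutoff)
    return total / len(qrels)


# Toy data: q1 has its first relevant document at rank 2, q2 retrieves nothing relevant.
example_runs = {"q1": ["d3", "d1", "d2"], "q2": ["d5", "d4"]}
example_qrels = {"q1": {"d1"}, "q2": {"d9"}}
example_mrr = mean_reciprocal_rank(example_runs, example_qrels)
```

On the toy data, `q1` contributes 1/2 and `q2` contributes 0, so `example_mrr` is 0.25. Higher is better, so under this definition the new setup's values in the table are slightly higher than the old ones for both models.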
