How long does it take to train a sequence_parallel BERT? #1365
              
Unanswered
tianboh asked this question in Community | Q&A
Replies: 0 comments
  
Hi, I am following the documentation to train in the Docker environment.
I use the default config as
I am using 4 V100 GPUs to train this model. However, I have only completed ~2,400 iterations after an hour of training. There are 1,000,000 iterations in total, so at this rate training would take ~17 days to finish. Is this normal, or do I need to reduce the number of training iterations?
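For reference, the ~17-day figure comes from extrapolating the observed rate linearly. Here is a minimal sketch of that arithmetic, assuming throughput stays constant for the whole run (the function name `estimate_days` is only illustrative and not part of any library):

```python
def estimate_days(done_iters: int, elapsed_hours: float, total_iters: int) -> float:
    """Estimate total wall-clock days, assuming a constant iteration rate."""
    iters_per_hour = done_iters / elapsed_hours   # observed throughput
    total_hours = total_iters / iters_per_hour    # linear extrapolation
    return total_hours / 24


# Numbers from the post: ~2,400 iterations in 1 hour, 1,000,000 in total.
total_days = estimate_days(2400, 1.0, 1_000_000)
print(f"{total_days:.1f} days")  # prints "17.4 days"
```

This is only a rough estimate: warm-up, evaluation, and checkpointing steps can make the real rate differ from the first hour.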