MADLAD model

Laura Burdick edited this page Oct 13, 2025 · 4 revisions

A few notes about working with the MADLAD model.

  • Unlike NLLB, MADLAD does not let you specify a source language. Only the target language is given.
  • Data format: Source sentences are prepended with the target language code in the form <2xx>, where xx is the target language code. Nothing is added to target sentences.
  • MADLAD uses the same set of ISO codes as NLLB. This means that the two models follow the same standard, not that they were trained on the same set of languages.
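
As a rough illustration of the data format described above, here is a minimal Python sketch. The function names are hypothetical and not part of silnlp, and the single space between the tag and the sentence is an assumption based on the usage example in the public MADLAD model card.

```python
from typing import Tuple


def add_target_tag(source_sentence: str, target_lang: str) -> str:
    """Prepend the MADLAD target-language tag (<2xx>) to a source sentence.

    Assumption: one space separates the tag from the sentence.
    """
    return f"<2{target_lang}> {source_sentence}"


def format_pair(source: str, target: str, target_lang: str) -> Tuple[str, str]:
    """Tag the source side only; the target sentence is returned unchanged."""
    return add_target_tag(source, target_lang), target


if __name__ == "__main__":
    # "pt" is only an example target language code.
    print(format_pair("I love pizza!", "Eu adoro pizza!", "pt"))
```

Note that no source language appears anywhere in the formatted pair, which matches the first point above.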

Update (10/10/25):

  • MADLAD can run out of memory on the A100s (jobs_backlog queue) or the 48 GB H100 (half of one of the Cheetah servers). It works if you run it on the 94 GB H100 (one of the full Cheetah servers).
  • MADLAD does not work with SDPA (scaled dot-product attention), which is the current default in silnlp. To use MADLAD, you must specify eager attention. Here's a sample config file:
data:
  corpus_pairs:
  - corpus_books: NT
    src: xxx-sourceBible
    test_books: MAT11-16
    trg: xxx-targetBible
    type: train,test
  lang_codes:
    xxx: xxx_Latn
    xxx: xxx_Latn
  seed: 111
model: google/madlad400-3b-mt
params:
  attn_implementation: eager
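
For anyone loading the model outside silnlp: `attn_implementation` is also the name of a keyword argument accepted by `from_pretrained` in Hugging Face transformers. The sketch below assumes the config value above corresponds to that argument, which this page does not confirm. The helper name `load_madlad_eager` is hypothetical.

```python
MADLAD_MODEL = "google/madlad400-3b-mt"


def load_madlad_eager(model_name: str = MADLAD_MODEL):
    """Load the tokenizer and model with eager attention instead of SDPA."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        model_name,
        attn_implementation="eager",  # same value as params.attn_implementation
    )
    return tokenizer, model
```

Calling `load_madlad_eager()` downloads the 3B checkpoint, so it is best run on a machine with enough memory (see the memory note above).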