Skip to content

Conversation

ananthsub
Copy link
Contributor

fix #838
fix #837

  • Adds HuggingFace chat template support through use_hf_tokenizer_chat_template
  • Tool calling & function schemas
  • OpenAI messages format conversion
  • Packed sequence support with chat templates
  • Pass dataset_kwargs through packing pipeline
  • update old copy/pasted docstrings & typehints

based heavily on:

@ananthsub ananthsub self-assigned this Oct 3, 2025
Copy link

copy-pr-bot bot commented Oct 3, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ananthsub
Copy link
Contributor Author

/ok to test 777dd78

@ananthsub
Copy link
Contributor Author

/ok to test b908582

Signed-off-by: Ananth Subramaniam <[email protected]>
Signed-off-by: Ananth Subramaniam <[email protected]>
Signed-off-by: Ananth Subramaniam <[email protected]>
Signed-off-by: Ananth Subramaniam <[email protected]>
@ananthsub ananthsub force-pushed the tool-calling-dataset branch from b908582 to 84eacbc Compare October 3, 2025 17:55
@ananthsub
Copy link
Contributor Author

/ok to test 84eacbc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Chat dataset support missing from MegatronBridge Tool calling support missing from MegatronBridge Datasets
1 participant