Conversation

@johnnynunez (Contributor)

Thor and Spark

@MasterJH5574 (Member)

Hi @johnnynunez, thanks for contributing! We found that the change in https://github.com/apache/tvm/pull/18300/files is the only one we need for CUDA 13. Everything else is compatible with CUDA 13, including flashinfer-python.

We removed AOT FlashInfer support a while ago, so the CMake config USE_FLASHINFER has no actual effect. I can help take over the PR and clean up those configs in the codebase.

@johnnynunez (Contributor, Author) commented on Sep 10, 2025

> Hi @johnnynunez, thanks for contributing! We found that the change in https://github.com/apache/tvm/pull/18300/files is the only one we need for CUDA 13. Everything else is compatible with CUDA 13, including flashinfer-python.
>
> We removed AOT FlashInfer support a while ago, so the CMake config USE_FLASHINFER has no actual effect. I can help take over the PR and clean up those configs in the codebase.

Thanks! It's because I was using Thor and Spark, and that FlashInfer version is compatible with CUDA 13.

@MasterJH5574 (Member)

> It's because I was using Thor and Spark, and that FlashInfer version is compatible with CUDA 13.

@johnnynunez Got it. We can bump it then. I am currently working on shipping our CUDA 13 Python package, and will update this PR after finishing that.

@johnnynunez (Contributor, Author)

> > It's because I was using Thor and Spark, and that FlashInfer version is compatible with CUDA 13.
>
> @johnnynunez Got it. We can bump it then. I am currently working on shipping our CUDA 13 Python package, and will update this PR after finishing that.

Thank you! CUDA 13 is the baseline for those devices.

@MasterJH5574 merged commit a690e94 into mlc-ai:main on Sep 15, 2025.
@MasterJH5574 (Member)

Hi @johnnynunez, I just updated this PR and got it merged. Unfortunately, FlashInfer 0.3.1 doesn't work with the latest tvm and mlc-llm main because of our recent rename of ffi::NDArray to ffi::Tensor. We have updated the FlashInfer side accordingly (flashinfer-ai/flashinfer@0828553) and expect the fix to be included in the next FlashInfer release.

For now, if you want to use FlashInfer, you will need to clone the FlashInfer GitHub repo and build it from source, following https://docs.flashinfer.ai/installation.html#python-package.
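
For reference, here is a minimal hedged Python check for whether the installed flashinfer-python postdates 0.3.1. The helper name and the "> 0.3.1" threshold are assumptions based on "the next FlashInfer release" above, and a from-source build may not be detected correctly by this:

```python
# Hedged sketch (not from this thread): guess whether the installed
# flashinfer-python postdates 0.3.1, the last release that still used
# ffi::NDArray on the TVM FFI side. The "> 0.3.1" threshold is an
# assumption based on "the next FlashInfer release" mentioned above.
from importlib.metadata import PackageNotFoundError, version

from packaging.version import Version


def flashinfer_has_tensor_rename() -> bool:  # hypothetical helper name
    """Best-effort check: releases after 0.3.1 are expected to include
    flashinfer-ai/flashinfer@0828553 (ffi::NDArray -> ffi::Tensor)."""
    try:
        installed = Version(version("flashinfer-python"))
    except PackageNotFoundError:
        return False  # flashinfer-python is not installed
    return installed > Version("0.3.1")


if __name__ == "__main__":
    print("compatible with latest tvm main:", flashinfer_has_tensor_rename())
```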
