garciadias
Contributor

Solving issue: #389

…t.toml for compatibility

Signed-off-by: R. Garcia-Dias <[email protected]>
pyproject.toml Outdated
Comment on lines 43 to 58
"nvidia-cublas-cu12>=12.9.1.4",
"nvidia-cuda-cupti-cu12>=12.9.79",
"nvidia-cuda-nvrtc-cu12>=12.9.86",
"nvidia-cuda-runtime-cu12>=12.9.79",
"nvidia-cudnn-cu12>=9.10.2.21",
"nvidia-cufft-cu12>=11.4.1.4",
"nvidia-cufile-cu12>=1.14.1.1",
"nvidia-curand-cu12>=10.3.10.19",
"nvidia-cusolver-cu12>=11.7.5.82",
"nvidia-cusparse-cu12>=12.5.10.65",
"nvidia-cusparselt-cu12>=0.7.1",
"nvidia-nccl-cu12>=2.27.3",
"nvidia-nvjitlink-cu12>=12.9.86",
"nvidia-nvtx-cu12>=12.9.79",
"torchvision>=0.23.0",
"triton>=3.4.0",

why do we need these btw?

Contributor Author

Hi @gshpychka, thank you for looking into these.
You are right, these are not necessary. They are the ones that changed when I followed the instructions from the PyTorch documentation. I have removed them and rebuilt the container, and as you predicted, they were not needed. The NVIDIA packages will be updated in the background, so they do not need to be explicitly defined. Torchvision is not necessary either.
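
One way to check that the NVIDIA wheels are still present after the explicit pins are removed is to list what is installed in the rebuilt environment. This is only a minimal sketch: the helper name `installed_with_prefix` is hypothetical, and it assumes the script runs with the same Python interpreter that the container uses.

```python
from importlib.metadata import distributions


def installed_with_prefix(prefix):
    """Map installed distribution names starting with `prefix` to their versions."""
    found = {}
    for dist in distributions():
        # .get() avoids errors on distributions with incomplete metadata.
        name = dist.metadata.get("Name")
        if name and name.lower().startswith(prefix.lower()):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    # "nvidia-" matches the package names from the removed pyproject.toml lines.
    for name, version in installed_with_prefix("nvidia-").items():
        print(f"{name}=={version}")
```

If the `nvidia-*-cu12` packages still show up in the output without being declared, they are being resolved as dependencies of something else in the lock file.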

@@ -1,4 +1,4 @@
-FROM --platform=$BUILDPLATFORM nvidia/cuda:12.8.1-base-ubuntu24.04
+FROM --platform=$BUILDPLATFORM nvcr.io/nvidia/cuda:12.9.1-cudnn-devel-ubuntu24.04

What's the rationale for switching from base to cudnn-devel?

i.e. why not cuda:12.9.1-base-ubuntu24.04?

Contributor Author

It seems there is an incompatibility between spaCy and the CMake library configuration on the base image:

180.2  Downloading spacy
181.4   × Failed to build `pyopenjtalk==0.4.1`
181.4   ├─▶ The build backend returned an error
181.4   ╰─▶ Call to `setuptools.build_meta.build_wheel` failed (exit status: 1)
181.4 
181.4       [stdout]
181.4       -- Configuring incomplete, errors occurred!
181.4 
181.4       [stderr]
181.4       CMake Error: CMake was unable to find a build program corresponding to
181.4       "Unix Makefiles".  CMAKE_MAKE_PROGRAM is not set.  You probably need to
181.4       select a different build tool.
181.4       CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
181.4       CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
181.4       Traceback (most recent call last):
181.4         File "<string>", line 14, in <module>
181.4         File
181.4       "/tmp/.tmpTbGGdY/builds-v0/.tmpobis9A/lib/python3.10/site-packages/setuptools/build_meta.py",
181.4       line 331, in get_requires_for_build_wheel
181.4           return self._get_build_requires(config_settings, requirements=[])
181.4         File
181.4       "/tmp/.tmpTbGGdY/builds-v0/.tmpobis9A/lib/python3.10/site-packages/setuptools/build_meta.py",
181.4       line 301, in _get_build_requires
181.4           self.run_setup()
181.4         File
181.4       "/tmp/.tmpTbGGdY/builds-v0/.tmpobis9A/lib/python3.10/site-packages/setuptools/build_meta.py",
181.4       line 317, in run_setup
181.4           exec(code, locals())
181.4         File "<string>", line 105, in <module>
181.4         File
181.4       "/home/appuser/.local/share/uv/python/cpython-3.10.18-linux-x86_64-gnu/lib/python3.10/subprocess.py",
181.4       line 457, in check_returncode
181.4           raise CalledProcessError(self.returncode, self.args, self.stdout,
181.4       subprocess.CalledProcessError: Command '['cmake', '..']' returned
181.4       non-zero exit status 1.
181.4 
181.4       hint: This usually indicates a problem with the package or the build
181.4       environment.
181.4   help: `pyopenjtalk` (v0.4.1) was included because `kokoro-fastapi` (v0.3.0)
181.4         depends on `misaki[ja]` (v0.9.4) which depends on `pyopenjtalk`
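
The CMake errors in the log above (`CMAKE_MAKE_PROGRAM` not set, `CMAKE_C_COMPILER` and `CMAKE_CXX_COMPILER` not set) suggest that the image is missing a build toolchain on `PATH` when `pyopenjtalk` is compiled. A quick way to confirm this inside a given image is sketched below. The helper name `missing_build_tools` and the candidate executable names are assumptions chosen for illustration, not part of this PR.

```python
import shutil

# Tools CMake needs for the "Unix Makefiles" generator with C and C++ enabled.
# The candidate executable names are assumptions; adjust them for your toolchain.
REQUIRED_TOOLS = {
    "cmake": ("cmake",),
    "make": ("make", "gmake"),
    "C compiler": ("cc", "gcc", "clang"),
    "C++ compiler": ("c++", "g++", "clang++"),
}


def missing_build_tools(which=shutil.which):
    """Return the labels of required tools with no candidate found on PATH."""
    return [
        label
        for label, candidates in REQUIRED_TOOLS.items()
        if not any(which(name) for name in candidates)
    ]


if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("All build tools found")
```

Running this in both the `base` and the `cudnn-devel` image would show whether the difference is just the presence of `make` and a compiler, in which case installing those on top of `base` might be an alternative to the larger image.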

rushyrush commented Oct 2, 2025

It appears that this is no longer being maintained. I need 5000 series support too, so @garciadias I forked your changes and got CI online.
The image is available at ghcr.io/rushyrush/kokoro-fastapi-gpu:v0.3.0
Successfully tested on a 4x5000 series cluster.

@remsky remsky merged commit 88dcf00 into remsky:master Oct 2, 2025
Owner

remsky commented Oct 2, 2025

Heya, sorry I've been absent. Have merged in currently, and working through the backlog 😅

rushyrush commented Oct 2, 2025

@remsky Don't apologize, glad you are back! I just couldn't wait anymore for 5000 support and progress looked stalled. Thanks for getting this merged! 😀
