Skip to content

Conversation

Acly
Copy link
Collaborator

@Acly Acly commented Sep 14, 2025

Adds a CMake option GGML_VULKAN_SHADER_DEV (default: off). If enabled, makes tweaking shaders less painful:

  • reads shader SPIRV from disk instead of embedding it
  • uses ninja to build shaders for incremental build & dependency tracking

So when you make a change to a shader file and build, exactly 1 glslc invocation will run, and no C++ compilation at all. This is faster by many orders of magnitude, and will no doubt enhance vulkan backend developer efficiency by a similar factor.

…h less compilation

* instead of embedding SPIRV in the library, read them from the filesystem instead
* don't need to recompile any .cpp files when modifying shaders
* allows incremental builds
* tracks shader dependencies
* requires ninja to be installed
@Acly Acly requested a review from 0cc4m as a code owner September 14, 2025 15:22
@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Sep 14, 2025
@jeffbolznv
Copy link
Collaborator

Does this have to be tied to ninja? I'd be happy to have proper incremental builds for the glsl files. Maybe we should have one generated file per glsl source file or something like that.

@Acly
Copy link
Collaborator Author

Acly commented Sep 14, 2025

Does this have to be tied to ninja?

Writing a file with a timestamp for each source file would be okay. Adding transitive dependencies from #include directives is some additional work towards reinventing ninja though.

I'm actually most worried about all the process and filesystem stuff, it's a quagmire of platform-specific weirdness. The current windows pipe handling is already broken and I'm not looking forward to adding sporadic "Access denied" issues to the list.

@jeffbolznv
Copy link
Collaborator

I don't understand why we'd need to track timestamps ourselves rather than use cmake to handle the dependencies. cmake is kind of annoying for generated source files but it's possible to make it work.

@Acly
Copy link
Collaborator Author

Acly commented Sep 18, 2025

Ah, you only mentioned without ninja, so I assumed keeping compiler invocations inside vulkan-shaders-gen as before.

It should be possible to generate a CMake instead of ninja file with custom_command & DEPFILE (last time I did this it was only possible for ninja, but looks like it supports few more generators now).

@Acly Acly force-pushed the vulkan-shader-dev branch from 8bb1ee4 to ea2dc96 Compare September 19, 2025 08:03
@Acly
Copy link
Collaborator Author

Acly commented Sep 19, 2025

Updated to use CMake.

It's still behind a cmake option atm. @jeffbolznv @0cc4m let me know if you think it's okay to use cmake as the default way to build shaders, then I can simplify by throwing out the old code.

@0cc4m
Copy link
Collaborator

0cc4m commented Sep 21, 2025

I'm okay with always using cmake for this. This is in fact very nice to speed up shader iteration, especially with my 2nd gen EPYC I usually have to wait a long time until I can test a change. Thank you!

@jeffbolznv
Copy link
Collaborator

I'll test this out, but it isn't quite what I had in mind. Why can't this all be done as cmake logic in the main makefile - get a list of shader files and generate targets for each of them? This seems like a pretty common thing that shouldn't require generating and running a new cmake file.

@jeffbolznv
Copy link
Collaborator

Does not build for me with msvc (with GGML_VULKAN_SHADER_DEV enabled):

17>------ Rebuild All started: Project: ggml-vulkan, Configuration: RelWithDebInfo x64 ------
17>Generate vulkan shaders
17>ggml_vulkan: Generating and compiling shaders to SPIR-V
17>Building Custom Rule C:/github/jeffbolznv/llama.cpp/ggml/src/ggml-vulkan/CMakeLists.txt
17>Configure vulkan shaders
17>-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.22631.
17>-- The C compiler identification is MSVC 19.35.32217.1
17>-- The CXX compiler identification is MSVC 19.35.32217.1
17>-- Detecting C compiler ABI info
17>-- Detecting C compiler ABI info - done
17>-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
17>-- Detecting C compile features
17>-- Detecting C compile features - done
17>-- Detecting CXX compiler ABI info
17>-- Detecting CXX compiler ABI info - done
17>-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
17>-- Detecting CXX compile features
17>-- Detecting CXX compile features - done
17>-- Configuring done (4.9s)
17>-- Generating done (2.1s)
17>-- Build files have been written to: C:/github/jeffbolznv/llama.cpp/build/ggml/src/ggml-vulkan/vulkan-shaders.spv
17>Build vulkan shaders
17>MSBuild version 17.5.1+f6fdcf537 for .NET Framework
17>MSBUILD : error MSB1009: Project file does not exist.
17>Switch: all.vcxproj
17>C:\Program Files\Microsoft Visual Studio\2022\Professional\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(247,5): error MSB8066: Custom build for 'C:\github\jeffbolznv\llama.cpp\build\CMakeFiles\0510dad5cdffa47c7f8757fcdb4746db\CMakeLists.txt.rule;C:\github\jeffbolznv\llama.cpp\build\CMakeFiles\1eb63d86afea9cbf795cbb5826c5e109\CMakeCache.txt.rule;C:\github\jeffbolznv\llama.cpp\build\CMakeFiles\1245e0c6710d744490760faba3a5abad\ggml-vulkan-shaders.hpp.rule;C:\github\jeffbolznv\llama.cpp\ggml\src\ggml-vulkan\CMakeLists.txt' exited with code 1.
17>Done building project "ggml-vulkan.vcxproj" -- FAILED.

@Acly
Copy link
Collaborator Author

Acly commented Sep 21, 2025

Why can't this all be done as cmake logic in the main makefile - get a list of shader files and generate targets for each of them?

It can, but since it's not a simple 1-1 relationship between .comp and .spv it requires translating and maintaining all the logic that generates the many variants to CMake.

Does not build for me with msvc

I use MSVC, but didn't try MSBuild... it should be fixed now. But it will not build shaders in parallel, MSBuild doesn't support that. I'm not sure why you're using it, ninja has been the default in MSVC for years and generally builds much faster.

@jeffbolznv
Copy link
Collaborator

I was able to build with the latest commit, but when I run test-backend-ops I got this:

Z:\github\jeffbolznv\llama.cpp\build\bin\RelWithDebInfo>test-backend-ops.exe
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce RTX 5090 (NVIDIA) | uma: 0 | fp16: 1 | bf16: 1 | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: NV_coopmat2
Testing 2 devices

Backend 1/2: Vulkan0
  Device description: NVIDIA GeForce RTX 5090
  Device memory: 32191 MB (31415 MB free)

  ABS(type=f16,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
  ABS(type=f16,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
  SGN(type=f16,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
  SGN(type=f16,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
  NEG(type=f16,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
  NEG(type=f16,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
  STEP(type=f16,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
  STEP(type=f16,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
C:\github\jeffbolznv\llama.cpp\ggml\src\ggml-vulkan\ggml-vulkan.cpp:1547: ggml_vulkan: Failed to open shader file: C:/github/jeffbolznv/llama.cpp/build/ggml/src/ggml-vulkan/vulkan-shaders.spv/tanh_f16.spv

@Acly
Copy link
Collaborator Author

Acly commented Sep 22, 2025

Hm, so it didn't build shaders at all?

Here is what I'm using to test:

C:\Dev\llama.cpp [vulkan-shader-dev ≡]> cmake -S . -B build -D GGML_VULKAN=ON -D GGML_VULKAN_SHADER_DEV=ON
-- Building for: Visual Studio 17 2022
-- The C compiler identification is MSVC 19.44.35214.0
-- The CXX compiler identification is MSVC 19.44.35214.0
...
-- Build files have been written to: C:/Dev/llama.cpp/build

C:\Dev\llama.cpp [vulkan-shader-dev ≡]> cmake --build build --config RelWithDebInfo
...

C:\Dev\llama.cpp [vulkan-shader-dev ≡]> .\build\bin\RelWithDebInfo\test-backend-ops.exe
Backend 1/2: Vulkan0
  Device description: NVIDIA GeForce RTX 4070
  Device memory: 12012 MB (11241 MB free)
...
  14471/14471 tests passed
  Backend Vulkan0: OK

I also tested just doing the first command, opening the .sln in VS IDE to build from there, and running test-backend-ops.

Both are extremely slow to build though, so limited practical use...

@jeffbolznv
Copy link
Collaborator

Oh, I think I see:

C:\github\jeffbolznv\llama.cpp\ggml\src\ggml-vulkan\ggml-vulkan.cpp:1547: ggml_vulkan: Failed to open shader file: C:/github/jeffbolznv/llama.cpp/build/ggml/src/ggml-vulkan/vulkan-shaders.spv/tanh_f16.spv

Looks like it embedded an absolute path, but I run the tests on a separate test system with this mapped as the Z drive.

@Acly
Copy link
Collaborator Author

Acly commented Sep 22, 2025

Right, I went with absolute paths so it can be launched from different places (root dir, build/bin, build/bin/Release), but it means it's not relocatable.

Copying to machine with different GPUs sounds like something that might be nice to do though, I can try to add some search path mechanism (try some common relative paths and fall back to absolute if not found)

@jeffbolznv
Copy link
Collaborator

Why can't this all be done as cmake logic in the main makefile - get a list of shader files and generate targets for each of them?

It can, but since it's not a simple 1-1 relationship between .comp and .spv it requires translating and maintaining all the logic that generates the many variants to CMake.

I did a proof of concept of what I had in mind. It doesn't require moving the logic to cmake, just invoking the process once per source file and filtering by file name. The incremental build and link times are much better (well, except for mul_mm.comp, but even that one is somewhat improved).

Please take a look: master...jeffbolznv:llama.cpp:shader_gen_per_file

@0cc4m
Copy link
Collaborator

0cc4m commented Sep 28, 2025

Let's get this merged in some way, it would be quite useful. I don't mind the current version, but what do you think of @jeffbolznv 's proposal @Acly ?

@Acly
Copy link
Collaborator Author

Acly commented Sep 28, 2025

My initial thought is that I don't like it conceptually, it would be better to have a custom executable that only writes a list of what to build, and let make/ninja build it. Remove the custom process launching code, and let the build-system do what they're good at, so they can see all compilation and distribute them to cores/threads efficiently. I don't like multiple build-system threads launching multiple shader-gen which then launch multiple glslc without coordination. But maybe this is only a theoretical problem.

The proposal certainly has some practical advantages, fewer changes and better compatibility with all the CMake generators. I want to test it and see how it compares for full build performance, just didn't have much time to do it lately. I think I'll get around to it in next couple days, so let's keep the PR open for now.

@0cc4m
Copy link
Collaborator

0cc4m commented Sep 29, 2025

No worries, take your time. I just didn't want this to disappear in the backlog.

@Acly
Copy link
Collaborator Author

Acly commented Sep 29, 2025

Created #16341 based on the draft by @jeffbolznv and some of the improvements from this PR.

Full builds are slightly slower than this PR, but it's not a big difference and my measurements aren't super precise.
Both are slower than master (on my system ~2:35 instead of ~2:25). The Ninja-only solution was the fastest, but forcing the dependency has its own issues.

I'd close this and go with #16341, mostly because it doesn't need a special CMake option and has the lowest risk of failing on various build setups/platforms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning Vulkan Issues specific to the Vulkan backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants