vulkan : shader development improvements #15993

Acly · 2025-09-14T15:22:27Z

Adds a CMake option GGML_VULKAN_SHADER_DEV (default: off). If enabled, makes tweaking shaders less painful:

reads shader SPIRV from disk instead of embedding it
uses ninja to build shaders for incremental build & dependency tracking

So when you make a change to a shader file and build, exactly 1 glslc invocation will run, and no C++ compilation at all. This is faster by many orders of magnitude, and will no doubt enhance vulkan backend developer efficiency by a similar factor.

…h less compilation * instead of embedding SPIRV in the library, read them from the filesystem instead * don't need to recompile any .cpp files when modifying shaders

* allows incremental builds * tracks shader dependencies * requires ninja to be installed

jeffbolznv · 2025-09-14T18:50:19Z

Does this have to be tied to ninja? I'd be happy to have proper incremental builds for the glsl files. Maybe we should have one generated file per glsl source file or something like that.

Acly · 2025-09-14T22:46:41Z

Does this have to be tied to ninja?

Writing a file with a timestamp for each source file would be okay. Adding transitive dependencies from #include directives is some additional work towards reinventing ninja though.

I'm actually most worried about all the process and filesystem stuff, it's a quagmire of platform-specific weirdness. The current windows pipe handling is already broken and I'm not looking forward to adding sporadic "Access denied" issues to the list.

jeffbolznv · 2025-09-18T13:02:20Z

I don't understand why we'd need to track timestamps ourselves rather than use cmake to handle the dependencies. cmake is kind of annoying for generated source files but it's possible to make it work.

Acly · 2025-09-18T16:01:38Z

Ah, you only mentioned without ninja, so I assumed keeping compiler invocations inside vulkan-shaders-gen as before.

It should be possible to generate a CMake instead of ninja file with custom_command & DEPFILE (last time I did this it was only possible for ninja, but looks like it supports few more generators now).

Acly · 2025-09-19T08:19:07Z

Updated to use CMake.

It's still behind a cmake option atm. @jeffbolznv @0cc4m let me know if you think it's okay to use cmake as the default way to build shaders, then I can simplify by throwing out the old code.

0cc4m · 2025-09-21T07:47:42Z

I'm okay with always using cmake for this. This is in fact very nice to speed up shader iteration, especially with my 2nd gen EPYC I usually have to wait a long time until I can test a change. Thank you!

jeffbolznv · 2025-09-21T15:06:43Z

I'll test this out, but it isn't quite what I had in mind. Why can't this all be done as cmake logic in the main makefile - get a list of shader files and generate targets for each of them? This seems like a pretty common thing that shouldn't require generating and running a new cmake file.

jeffbolznv · 2025-09-21T15:15:43Z

Does not build for me with msvc (with GGML_VULKAN_SHADER_DEV enabled):

17>------ Rebuild All started: Project: ggml-vulkan, Configuration: RelWithDebInfo x64 ------
17>Generate vulkan shaders
17>ggml_vulkan: Generating and compiling shaders to SPIR-V
17>Building Custom Rule C:/github/jeffbolznv/llama.cpp/ggml/src/ggml-vulkan/CMakeLists.txt
17>Configure vulkan shaders
17>-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.22631.
17>-- The C compiler identification is MSVC 19.35.32217.1
17>-- The CXX compiler identification is MSVC 19.35.32217.1
17>-- Detecting C compiler ABI info
17>-- Detecting C compiler ABI info - done
17>-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
17>-- Detecting C compile features
17>-- Detecting C compile features - done
17>-- Detecting CXX compiler ABI info
17>-- Detecting CXX compiler ABI info - done
17>-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
17>-- Detecting CXX compile features
17>-- Detecting CXX compile features - done
17>-- Configuring done (4.9s)
17>-- Generating done (2.1s)
17>-- Build files have been written to: C:/github/jeffbolznv/llama.cpp/build/ggml/src/ggml-vulkan/vulkan-shaders.spv
17>Build vulkan shaders
17>MSBuild version 17.5.1+f6fdcf537 for .NET Framework
17>MSBUILD : error MSB1009: Project file does not exist.
17>Switch: all.vcxproj
17>C:\Program Files\Microsoft Visual Studio\2022\Professional\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(247,5): error MSB8066: Custom build for 'C:\github\jeffbolznv\llama.cpp\build\CMakeFiles\0510dad5cdffa47c7f8757fcdb4746db\CMakeLists.txt.rule;C:\github\jeffbolznv\llama.cpp\build\CMakeFiles\1eb63d86afea9cbf795cbb5826c5e109\CMakeCache.txt.rule;C:\github\jeffbolznv\llama.cpp\build\CMakeFiles\1245e0c6710d744490760faba3a5abad\ggml-vulkan-shaders.hpp.rule;C:\github\jeffbolznv\llama.cpp\ggml\src\ggml-vulkan\CMakeLists.txt' exited with code 1.
17>Done building project "ggml-vulkan.vcxproj" -- FAILED.

Acly · 2025-09-21T18:18:22Z

Why can't this all be done as cmake logic in the main makefile - get a list of shader files and generate targets for each of them?

It can, but since it's not a simple 1-1 relationship between .comp and .spv it requires translating and maintaining all the logic that generates the many variants to CMake.

Does not build for me with msvc

I use MSVC, but didn't try MSBuild... it should be fixed now. But it will not build shaders in parallel, MSBuild doesn't support that. I'm not sure why you're using it, ninja has been the default in MSVC for years and generally builds much faster.

jeffbolznv · 2025-09-22T02:28:01Z

I was able to build with the latest commit, but when I run test-backend-ops I got this:

Z:\github\jeffbolznv\llama.cpp\build\bin\RelWithDebInfo>test-backend-ops.exe
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce RTX 5090 (NVIDIA) | uma: 0 | fp16: 1 | bf16: 1 | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: NV_coopmat2
Testing 2 devices

Backend 1/2: Vulkan0
  Device description: NVIDIA GeForce RTX 5090
  Device memory: 32191 MB (31415 MB free)

  ABS(type=f16,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
  ABS(type=f16,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
  SGN(type=f16,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
  SGN(type=f16,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
  NEG(type=f16,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
  NEG(type=f16,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
  STEP(type=f16,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
  STEP(type=f16,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
C:\github\jeffbolznv\llama.cpp\ggml\src\ggml-vulkan\ggml-vulkan.cpp:1547: ggml_vulkan: Failed to open shader file: C:/github/jeffbolznv/llama.cpp/build/ggml/src/ggml-vulkan/vulkan-shaders.spv/tanh_f16.spv

Acly · 2025-09-22T07:55:57Z

Hm, so it didn't build shaders at all?

Here is what I'm using to test:

C:\Dev\llama.cpp [vulkan-shader-dev ≡]> cmake -S . -B build -D GGML_VULKAN=ON -D GGML_VULKAN_SHADER_DEV=ON
-- Building for: Visual Studio 17 2022
-- The C compiler identification is MSVC 19.44.35214.0
-- The CXX compiler identification is MSVC 19.44.35214.0
...
-- Build files have been written to: C:/Dev/llama.cpp/build

C:\Dev\llama.cpp [vulkan-shader-dev ≡]> cmake --build build --config RelWithDebInfo
...

C:\Dev\llama.cpp [vulkan-shader-dev ≡]> .\build\bin\RelWithDebInfo\test-backend-ops.exe
Backend 1/2: Vulkan0
  Device description: NVIDIA GeForce RTX 4070
  Device memory: 12012 MB (11241 MB free)
...
  14471/14471 tests passed
  Backend Vulkan0: OK

I also tested just doing the first command, opening the .sln in VS IDE to build from there, and running test-backend-ops.

Both are extremely slow to build though, so limited practical use...

jeffbolznv · 2025-09-22T12:49:00Z

Oh, I think I see:

C:\github\jeffbolznv\llama.cpp\ggml\src\ggml-vulkan\ggml-vulkan.cpp:1547: ggml_vulkan: Failed to open shader file: C:/github/jeffbolznv/llama.cpp/build/ggml/src/ggml-vulkan/vulkan-shaders.spv/tanh_f16.spv

Looks like it embedded an absolute path, but I run the tests on a separate test system with this mapped as the Z drive.

Acly · 2025-09-22T13:44:07Z

Right, I went with absolute paths so it can be launched from different places (root dir, build/bin, build/bin/Release), but it means it's not relocatable.

Copying to machine with different GPUs sounds like something that might be nice to do though, I can try to add some search path mechanism (try some common relative paths and fall back to absolute if not found)

jeffbolznv · 2025-09-22T21:48:04Z

Why can't this all be done as cmake logic in the main makefile - get a list of shader files and generate targets for each of them?

It can, but since it's not a simple 1-1 relationship between .comp and .spv it requires translating and maintaining all the logic that generates the many variants to CMake.

I did a proof of concept of what I had in mind. It doesn't require moving the logic to cmake, just invoking the process once per source file and filtering by file name. The incremental build and link times are much better (well, except for mul_mm.comp, but even that one is somewhat improved).

Please take a look: master...jeffbolznv:llama.cpp:shader_gen_per_file

0cc4m · 2025-09-28T15:09:39Z

Let's get this merged in some way, it would be quite useful. I don't mind the current version, but what do you think of @jeffbolznv 's proposal @Acly ?

Acly · 2025-09-28T18:51:01Z

My initial thought is that I don't like it conceptually, it would be better to have a custom executable that only writes a list of what to build, and let make/ninja build it. Remove the custom process launching code, and let the build-system do what they're good at, so they can see all compilation and distribute them to cores/threads efficiently. I don't like multiple build-system threads launching multiple shader-gen which then launch multiple glslc without coordination. But maybe this is only a theoretical problem.

The proposal certainly has some practical advantages, fewer changes and better compatibility with all the CMake generators. I want to test it and see how it compares for full build performance, just didn't have much time to do it lately. I think I'll get around to it in next couple days, so let's keep the PR open for now.

0cc4m · 2025-09-29T04:25:20Z

No worries, take your time. I just didn't want this to disappear in the backlog.

Acly · 2025-09-29T19:12:33Z

Created #16341 based on the draft by @jeffbolznv and some of the improvements from this PR.

Full builds are slightly slower than this PR, but it's not a big difference and my measurements aren't super precise.
Both are slower than master (on my system ~2:35 instead of ~2:25). The Ninja-only solution was the fastest, but forcing the dependency has its own issues.

I'd close this and go with #16341, mostly because it doesn't need a special CMake option and has the lowest risk of failing on various build setups/platforms.

Acly added 2 commits September 14, 2025 16:55

vulkan : add GGML_VULKAN_SHADER_DEV option for kernel development wit…

0cc76d8

…h less compilation * instead of embedding SPIRV in the library, read them from the filesystem instead * don't need to recompile any .cpp files when modifying shaders

vulkan : use ninja to compile shaders when GGML_VULKAN_SHADER_DEV is set

625449a

* allows incremental builds * tracks shader dependencies * requires ninja to be installed

Acly requested a review from 0cc4m as a code owner September 14, 2025 15:22

github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Sep 14, 2025

generate CMakeLists.txt instead of ninja build file

ea2dc96

Acly force-pushed the vulkan-shader-dev branch from 8bb1ee4 to ea2dc96 Compare September 19, 2025 08:03

fix MSBuild

c3230c3

Acly mentioned this pull request Sep 29, 2025

vulkan : incremental shader builds #16341

Open

vulkan : shader development improvements #15993

Are you sure you want to change the base?

vulkan : shader development improvements #15993

Conversation

Acly commented Sep 14, 2025

Uh oh!

jeffbolznv commented Sep 14, 2025

Uh oh!

Acly commented Sep 14, 2025

Uh oh!

jeffbolznv commented Sep 18, 2025

Uh oh!

Acly commented Sep 18, 2025

Uh oh!

Acly commented Sep 19, 2025

Uh oh!

0cc4m commented Sep 21, 2025

Uh oh!

jeffbolznv commented Sep 21, 2025

Uh oh!

jeffbolznv commented Sep 21, 2025

Uh oh!

Acly commented Sep 21, 2025

Uh oh!

jeffbolznv commented Sep 22, 2025

Uh oh!

Acly commented Sep 22, 2025

Uh oh!

jeffbolznv commented Sep 22, 2025

Uh oh!

Acly commented Sep 22, 2025

Uh oh!

jeffbolznv commented Sep 22, 2025

Uh oh!

0cc4m commented Sep 28, 2025

Uh oh!

Acly commented Sep 28, 2025

Uh oh!

0cc4m commented Sep 29, 2025

Uh oh!

Acly commented Sep 29, 2025

Uh oh!

Uh oh!