Possible Vulkan Bug? 3.14.0 CTDs splitting layers over native + D3D12 wrapper #515
Sorry, I wasn't sure how to summarize that title better. I'll admit this is a new arena for me, so if I'm doing something dumb, missing something, or there's a feature or option I ought to be using to fix this, I'm more than happy to be corrected. Otherwise, I think I may have found a bug in the implementation, but I wanted to check here for solutions and support before submitting a bug report, to make sure this is actually a bug and not just me doing something wrong.

Context:
Logs show the backend splitting model layers across device 2 (native NVIDIA) and device 3 (a D3D12 wrapper of the same physical GPU), then crashing during token generation with:
This happens regardless of what I set the environment variable to. The tests I ran:

- Test 1: Set `GGML_VK_VISIBLE_DEVICES=2` (native NVIDIA).
  Expected: only use device 2 (the native NVIDIA Vulkan driver).
- Test 2: Set `GGML_VK_VISIBLE_DEVICES=2,2`.
  Rationale: attempt to force a single device by specifying it twice.
- Test 3: Set `GGML_VK_VISIBLE_DEVICES=1` (AMD integrated; I did this basically just to see whether it tried to honor it or ignored it).
  Expected: only use device 1 (the AMD integrated GPU).
- Test 4: Don't set the environment variable at all.
  Expected: automatic device selection with UUID deduplication.

So basically, even when explicitly setting the environment variable:
llama.cpp STILL uses both NVIDIA devices:
Then splits layers:
Final error:
Excerpt from the CTD crashdump:
Other thoughts: in `ggml-vulkan.cpp` I noticed a few things when I tried to dig into this.

Lines 4694–4811: the code has two separate paths for device enumeration:

Basically, if I'm understanding this right: when lines 4767–4773 (still in that function)
According to this, since there's no entry for

And, in case it helps, I'll just go ahead and throw this in here, from my app test runs where I'm trying to integrate this. I've been testing with Phi4:
Replies: 1 comment 8 replies
@praetoras-del Thank you for reporting and doing such a comprehensive investigation!

It seems peculiar to me that the same GPU appears twice; the Microsoft Direct3D12 entry most likely reports a different UUID, so the deduplication code regards it as a different device.

In these logs you can see that the PCI identifier of the first device is `0000:01:00.0`, while for the second one it's unknown. I researched a bit and it appears that the

To try it, first install the prerequisites for building the Vulkan backend, and then run these commands:

```shell
npm install node-llama-cpp@latest
npx --no node-llama-cpp source download --gpu vulkan --repo giladgd/llama.cpp --release b6795.1
npx --no node-llama-cpp inspect gpu
vulkaninfo
```

Please share the outputs of these commands so I can see whether it works as I expect. I have a few other implementation details that I would want to test to make the fix more robust, but this should be a good first test to see whether this fix is in the right direction.