Automatically mark `objc2-metal` as safe #792

madsmtm · 2025-09-29T16:38:15Z

#783 relaxed our safety policy to allow us to mark things as automatically safe. This would be very useful to do in objc2-metal as well, which is what this PR does.

There are a lot of wrinkles to that though, since Metal is fairly low-level, and there's a lot of seemingly innocuous things that can cause memory safety issues down the line, it effectively allows doing arbitrary things on another processor (the GPU). One example of this is MTLFunction, which amounts to an unsafe extern "C" fn() that the GPU will gladly call with whatever arguments you specify, and thus we have to mark functions taking it as unsafe.

In a similar vein, all MTLResources are marked as unsafe, as accesses to these are unsynchronized, can be unretained, and need to be bound before use. MTLBuffer is also untyped. See also the more broad questions raised in #773 (comment).

For now, I'd like to postpone the discussion of bounds safety, I've used unsafe-default-safety.not-bounds-affecting = true to do that. It still has a lot of false positives, I have fixed some of those in this PR, but we'll have to deal with the rest later. I have opened #793 to track this.

Replaces #773, #774 and #790. Closes #685. This is important for gfx-rs/wgpu#5641 and for making Metal 4 nicer to use.

framework-crates/objc2-metal/translation-config.toml

MarijnS95 · 2025-10-01T13:06:25Z

I'm trying this out on our codebase, and stuck in a hard place between using this or #773 as a basis for a change pending other changes in the upcoming release. A few function thoughts:

MTLHeapDescriptor setType: should remain safe or that's a semver break;
dispatchThreadgroupsWithIndirectBuffer:threadsPerThreadgroup: should probably be marked unsafe because it takes a MTLGPUAddress;
- I tried debugging fn can_affect_bounds() and it should catch this, but it equally doesn't catch some MTL4BufferRange?
gpuResourceID() was previously unsafe, it allows to do unsafe things but isn't inherently unsafe itself so perhaps it should be treated as safe like many .as_ptr() on Rust slices etc;

setMaxBufferBindCount: / setMaxTextureBindCount: / setMaxSamplerStateBindCount: could probably be marked as safe, because their values are "bounds" checked later:

-[MTLDebugDevice newArgumentTableWithDescriptor:error:]:4160: failed assertion `Argument Table Descriptor Validation
max texture bind count must not be more than 128.

EDIT: I pushed the fixes at metal-safe...Traverse-Research:objc2:metal-safe

madsmtm · 2025-10-01T15:17:29Z

I tried debugging fn can_affect_bounds() and it should catch this, but it equally doesn't catch some MTL4BufferRange?

That's because there's now two checks: Both can_affect_bounds, and the is_likely_bounds_affecting here:

objc2/crates/header-translator/src/name_translation.rs

Lines 449 to 462 in 9bd895f

    
           /// Whether a parameter or function name is likely to require a bounds check. 
        
           /// 
        
           /// This is only a best-effort heuristic, the library author may have called 
        
           /// this any number of other things. 
        
           pub(crate) fn is_likely_bounds_affecting(name: &str) -> bool { 
        
               let name = name.to_lowercase(); 
        
               name.contains("idx") 
        
                   || name.contains("index") 
        
                   || name == "i" 
        
                   || name.contains("capacity") 
        
                   || name.contains("range") 
        
                   || name.contains("offset") 
        
                   || name.contains("count") 
        
           }

EDIT: I pushed the fixes at metal-safe...Traverse-Research:objc2:metal-safe

Thanks, I pulled that in.

MarijnS95 · 2025-10-01T15:19:04Z

Ah! If you extend that function, the half-assed exceptions from my commit can go again. It's surely not complete.

madsmtm · 2025-10-01T21:39:18Z

I've looked through the documentation on Metal resources, and I think the correct choice is to mark (basically) all resource (MTLBuffer/MTLTexture/``) accesses as unsafe, since the user has to synchronize accesses to these?

I have implemented this in the latest commit, please let me know what you think?

madsmtm · 2025-10-01T22:09:36Z

framework-crates/objc2-metal/translation-config.toml

+# SAFETY: Modifying residency is safe, it's effectively the same as
+# controlling what's in the L1/L2/L3 cache on the CPU.
+# protocol.MTLResidencySet.methods.requestResidency.unsafe = false
+# protocol.MTLResidencySet.methods.endResidency.unsafe = false


@MarijnS95 could you verify this?

As I understand it, residency sets are only an optimization knob? If a resource is needed, but isn't resident, it's still automatically made resident, right?

@madsmtm sure, there are two stages to it: the methods quoted here are an optimization knob to make things resident ahead of time. Everything in a residency set attached to a command buffer (or queue before a command buffer was committed to it) will be made/kept resident when the command buffer is supposedly going to be running...

Caution
Or at least so I thought, requestResidency is quite clear that it performs some work upfront to make commit: faster but endResidency directly says it makes things no longer resident which -to me- seem to defeat the advantage of attaching it to command buffers. I don't know what will happen if that's called on a MTLResidencySet that's attached to a command buffer that's Enqueued but not Completed.

What also counts is what resources are in the set at any given point in time. As far as I know the state after commit: is not buffered, so if anyone removes a resource and commits it again while a MTLCommandBuffer is still in flight that uses said resource, things could go kaboom as well when something in the system wants to reclaim memory (by calling endResidency?).

In the end I don't exactly understand the significance of these functions. Maybe we're asking too complicated questions, the documentation is quite terse.

I'm not confident what residency means on Unified Memory architectures (unless page files exist and are used in times of high memory pressure); perhaps buffers that "exist" in memory may not be in the MMU for the GPU (perhaps per-process)?

I tested things a bit by modifying this example to create a MTLResidencySet and add _terrainTextures to that instead of using useResource:usage:, roughly:

MTLResidencySetDescriptor *setDescriptor = [[MTLResidencySetDescriptor alloc] init]; NSError* error; id<MTLResidencySet> residencySet = [device newResidencySetWithDescriptor:setDescriptor error:&error]; if (!residencySet) { NSLog(@"Failed to create residence set, error %@", error); } for (int i = 0; i < _terrainTextures.size(); i++) { [residencySet addAllocation: _terrainTextures[i].diffSpecTextureArray]; [residencySet addAllocation: _terrainTextures[i].normalTextureArray]; } [residencySet commit]; [commandQueue addResidencySet: residencySet];

Things didn't work without doing either one of these, using both didn't change the output. Re-reading, I guess that's what is stated in the documentation, so that probably makes sense. It also matches what the Metal debugger showed.

I tried calling both endResidency on each frame, it didn't change the result, but the CPU usage grew from ~30% to ~50%, so that seems to only be an optimization.

In any case, pretty sure we can consider residency and useResource:usage: as safe, we can move the unsafety of not having a resource bound to setTexture:atIndex: and similar command encoder functions?

madsmtm · 2025-10-01T22:14:33Z

framework-crates/objc2-metal/translation-config.toml

+# These affect lifetime safety, and can cause use-after-free if used incorrectly.
+class.MTLCommandBufferDescriptor.methods."setRetainedReferences:".unsafe = true
+protocol.MTLCommandQueue.methods.commandBufferWithUnretainedReferences.unsafe = true
+protocol.MTLIOCommandQueue.methods.commandBufferWithUnretainedReferences.unsafe = true


@MarijnS95 pointed out in #773 (comment):

Lifetime safety: This used to be tracked by Metal if not opted out but that already wasn't possible when manually uploading gpuReourceID() values, and is removed now in Metal 4.

Which is just.. Ugh! Not yet sure how we'll resolve that. Maybe all command buffer and command encoder methods need to be unsafe? Maybe it'll be fine since most of them already are?

Yeah, all dispatches on the GPU (shader dispatches, draws, copy commands, ...) should likely be unsafe because of resource lifetimes 🤷

Re-reading the docs for commandBufferWithUnretainedReferences, it says:

Returns a command buffer from the command queue that doesn’t maintain strong references to resources.

So it sounds like the things that aren't retained only applies to MTLResource and subprotocols, and that it'd still keep a reference to any MTLFence, MTLCommandBufferHandler, MTLSamplerState etc. that you pass to the buffer / its encoders?

As such, I think we can again get around this by marking MTLResource as unsafe (and probably add another safety requirement to the docs)?

madsmtm · 2025-10-02T22:06:47Z

I think this PR is ready now. Overall, I believe that marking MTLResource and MTLFunction as unsafe was the correct choice, and that this mostly resolves the safety concerns outlined in #773 (comment).

I've gotten a bit lost between all our threads on this, but I think I've answered all your concerns @MarijnS95? WDYT about this PR in it's current state (if you look past all the bounds-safety stuff)?

(I strongly suspect there's still more to be done here, but I'm honestly reaching my limit of the amount of unsafe code I can review, I find myself glazing over things by now - so while this PR is probably unsound in a few places, I think it's enough of an improvement that it'd be worth it to merge it now anyhow).

This has a lot of false positives, but it's probably better than nothing.

To not consider MTLIndexType and similar as bounds-affecting.

These need to be synchronized.

And mark `commandBufferWithUnretainedReferences` as safe.

madsmtm added enhancement New feature or request A-framework Affects the framework crates and the translator for them I-unsound A soundness hole, or affecting soundness labels Sep 29, 2025

madsmtm mentioned this pull request Sep 29, 2025

Mark Metal as automatically safe #790

Closed

madsmtm added this to the frameworks v0.3.2 milestone Sep 29, 2025

madsmtm force-pushed the metal-safe branch from 1f615b6 to c7e27e2 Compare September 29, 2025 19:24

madsmtm mentioned this pull request Sep 30, 2025

Release framework crates v0.3.2 #794

Draft

6 tasks

MarijnS95 reviewed Sep 30, 2025

View reviewed changes

framework-crates/objc2-metal/translation-config.toml Outdated Show resolved Hide resolved

MarijnS95 reviewed Sep 30, 2025

View reviewed changes

framework-crates/objc2-metal/translation-config.toml Outdated Show resolved Hide resolved

MarijnS95 reviewed Sep 30, 2025

View reviewed changes

framework-crates/objc2-metal/translation-config.toml Outdated Show resolved Hide resolved

madsmtm commented Oct 1, 2025

View reviewed changes

madsmtm force-pushed the metal-safe branch from 06b9c59 to 325ff16 Compare October 2, 2025 21:51

madsmtm marked this pull request as ready for review October 2, 2025 22:06

madsmtm and others added 5 commits October 4, 2025 00:07

Mark MTLFunction as unsafe

7c6c789

Conservatively mark Metal as not bounds-checked

b822e0c

This has a lot of false positives, but it's probably better than nothing.

Mark Metal as automatically safe

7660dba

Relax bounds-checking

5343fb6

To not consider MTLIndexType and similar as bounds-affecting.

Update Metal safety

3ab78a5

madsmtm added 8 commits October 4, 2025 00:11

Avoid breaking changes

0bfd1f6

Mark MTLResource and subprotocols as unsafe

7bdb3f8

These need to be synchronized.

Be less sure about some bounds-affecting methods

17eab2d

Make it clearer that MTLResourceID is probably unsafe

4ce3eab

Mark "stride" methods as maybe bounds-affecting

590b0c3

Mark "size" methods as maybe bounds-affecting

5f13c1c

Document resources as unretained

b646632

And mark `commandBufferWithUnretainedReferences` as safe.

Make residency and resource bound-ness clearer

51ca607

madsmtm force-pushed the metal-safe branch from 325ff16 to 51ca607 Compare October 3, 2025 22:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Automatically mark `objc2-metal` as safe #792

Automatically mark `objc2-metal` as safe #792

madsmtm commented Sep 29, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

MarijnS95 commented Oct 1, 2025 •

edited

Loading

Uh oh!

madsmtm commented Oct 1, 2025

Uh oh!

MarijnS95 commented Oct 1, 2025

Uh oh!

madsmtm commented Oct 1, 2025

Uh oh!

madsmtm Oct 1, 2025

Uh oh!

MarijnS95 Oct 2, 2025 •

edited by madsmtm

Loading

Uh oh!

madsmtm Oct 2, 2025

Uh oh!

madsmtm Oct 1, 2025

Uh oh!

MarijnS95 Oct 2, 2025

Uh oh!

madsmtm Oct 2, 2025

Uh oh!

madsmtm commented Oct 2, 2025

Uh oh!

Uh oh!

Automatically mark objc2-metal as safe #792

Are you sure you want to change the base?

Automatically mark objc2-metal as safe #792

Conversation

madsmtm commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

MarijnS95 commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

madsmtm commented Oct 1, 2025

Uh oh!

MarijnS95 commented Oct 1, 2025

Uh oh!

madsmtm commented Oct 1, 2025

Uh oh!

madsmtm Oct 1, 2025

Choose a reason for hiding this comment

Uh oh!

MarijnS95 Oct 2, 2025 • edited by madsmtm Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

madsmtm Oct 2, 2025

Choose a reason for hiding this comment

Uh oh!

madsmtm Oct 1, 2025

Choose a reason for hiding this comment

Uh oh!

MarijnS95 Oct 2, 2025

Choose a reason for hiding this comment

Uh oh!

madsmtm Oct 2, 2025

Choose a reason for hiding this comment

Uh oh!

madsmtm commented Oct 2, 2025

Uh oh!

Uh oh!

Automatically mark `objc2-metal` as safe #792

Automatically mark `objc2-metal` as safe #792

madsmtm commented Sep 29, 2025 •

edited

Loading

MarijnS95 commented Oct 1, 2025 •

edited

Loading

MarijnS95 Oct 2, 2025 •

edited by madsmtm

Loading