_posts/2024-09-20-error-handling.markdown
12 additions & 9 deletions
Original file line number
Diff line number
Diff line change
@@ -7,7 +7,7 @@ author: "David Chisnall"
7
7
---
8
8
9
9
CHERI platforms in general, and CHERIoT in particular, can turn a lot of bugs that would be silent data corruption into recoverable errors.
10
-
The 'recoverable' part comes from the fact that the error is caught *before* you perform the invalid operation.
10
+
The 'recoverable' part comes from the fact that any error is caught *before* an invalid operation succeeds.
11
11
12
12
In [CheriBSD](https://www.cheribsd.org) and the proposed CHERI extensions to POSIX, CHERI faults are delivered as signals.
13
13
In CHERIoT RTOS, we have a similar mechanism.
@@ -16,7 +16,7 @@ This can be used to perform low-level error recovery operations, such as skippin
16
16
17
17
Custom error handlers are quite difficult to write.
18
18
In addition, the design made it impossible to run them when the fault was caused by stack exhaustion.
19
-
Although we have [tools to help avoid stack overflows](https://cheriot.org/rtos/stack/programming/2024/05/01/stack-usage.html), these still happen (particularly when incorporating third-party code) and it's nice to have a general mechanism for using them.
19
+
Although we have [tools to help avoid stack overflows](https://cheriot.org/rtos/stack/programming/2024/05/01/stack-usage.html), these still happen (particularly when incorporating third-party code) and it's nice to have a general mechanism for handling them.
20
20
21
21
With a few recent PRs, we've built a much more developer-friendly mechanism for handling errors.
22
22
@@ -84,19 +84,22 @@ If we built a linked-list of `jmp_buf`s there, then it would be possible for the
84
84
This would be bad.
85
85
86
86
What we want is not *thread-local* storage but *compartment-invocation-local* storage.
87
-
Each time a thread enters a new compartment, you should get some storage that is not tied to the depth in the control stack.
87
+
Each time a thread enters a new compartment, you should get some storage that is not local to the current function but can be accessed from any nested call.
88
88
89
89
To implement this, we looked at the earliest way that operating systems have implemented thread-local storage: reserve some space at the top of the stack.
90
90
91
+
When the switcher transitions between compartments, it truncates the stack so that the callee doesn't have access to the caller's stack.
91
92
Now, on entry into a compartment, the switcher will move the stack pointer 16 bytes down before transferring control into the callee.
92
93
Similarly, the loader will reserve 16 bytes at the top of the stack before starting a thread.
93
94
94
95
This means that you have two pointers worth of space that are easy to find (set the address of the stack pointer to its top, then set the address to eight or 16 bytes below that).
95
-
CHERIoT has a convenient `cgettop` instruction and so this sequence is very short. First, `cgettop` gives the top address, then `csetaddress` gives a new capability derived from the stack pointer that points to the address. After that, a `-8` immediate offset to a load or store capability instruction can access the space, so we need only three instructions to load the head of the list.
96
+
CHERIoT has a convenient `cgettop` instruction and so this sequence is very short.
97
+
First, `cgettop` gives the top address, then `csetaddress` gives a new capability derived from the stack pointer that points to the address.
98
+
After that, a `-8` immediate offset to a load or store capability instruction can access the space, so we need only three instructions to load the head of the list.
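
The three-step sequence described here can be modelled in portable C. This is only a sketch under stated assumptions: `struct stack_cap`, `cap_top`, `cap_set_address`, `store_head`, and `load_head` are hypothetical names that imitate a bounded stack capability in software, not CHERIoT RTOS APIs, and the 8-byte slot is held in a `uint64_t` rather than a real capability. The point it illustrates is that the slot is found from the *top* of the stack, so the result does not depend on how deep the stack pointer currently is.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SLOT_SIZE = 8 }; /* one pointer-sized slot, as in the text */

/* Hypothetical software model of a bounded stack-pointer capability. */
struct stack_cap
{
	unsigned char *base;   /* lowest byte the capability covers */
	size_t         length; /* size of the covered region in bytes */
	size_t         offset; /* current address, relative to base */
};

/* Models `cgettop`: the address one past the end of the region. */
static size_t cap_top(struct stack_cap cap)
{
	return cap.length;
}

/* Models `csetaddress`: same bounds, different address. */
static struct stack_cap cap_set_address(struct stack_cap cap, size_t address)
{
	cap.offset = address;
	return cap;
}

/* Models a capability store with a -8 immediate offset from the top. */
static void store_head(struct stack_cap csp, uint64_t value)
{
	struct stack_cap top = cap_set_address(csp, cap_top(csp));
	memcpy(top.base + top.offset - SLOT_SIZE, &value, SLOT_SIZE);
}

/* Models the matching capability load. */
static uint64_t load_head(struct stack_cap csp)
{
	uint64_t         value;
	struct stack_cap top = cap_set_address(csp, cap_top(csp));
	memcpy(&value, top.base + top.offset - SLOT_SIZE, SLOT_SIZE);
	return value;
}
```

In this model, a value stored while `offset` is at one depth can be loaded again after `offset` has moved, because both accesses are derived from the top of the region rather than from the current address.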
96
99
97
-
With this, we can store the head of the linked list of error handlers at the top of the stack.
100
+
With this, we can store the head of the linked list of error handlers at the top of the stack for the current compartment invocation.
98
101
When you want to jump to the nearest error handler, find the head of the list relative to the stack pointer, pop it, and pass it to `longjmp`.
99
-
The `cleanup_unwind` does all of this for you, so typically you won't need to ever see that this is how it's implemented.
102
+
The `cleanup_unwind` function does all of this for you, so typically you won't need to ever see that this is how it's implemented.
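
As a rough illustration of the find, pop, and `longjmp` pattern, here is a minimal sketch in standard C. It is not the CHERIoT RTOS implementation: `struct handler_node`, `handler_push`, `handler_pop`, `unwind_to_nearest`, `risky`, and `run_protected` are hypothetical names, and a file-scope variable stands in for the slot that the post places at the top of the stack, so that the example runs on any hosted C implementation.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: one node in the list of active error handlers. */
struct handler_node
{
	jmp_buf              env;
	struct handler_node *next;
};

/*
 * Stand-in for the compartment-invocation-local slot. This is an assumption
 * made for portability; the post keeps the head at the top of the stack.
 */
static struct handler_node *handler_head = NULL;

/* Register a handler as the nearest one. */
static void handler_push(struct handler_node *node)
{
	node->next   = handler_head;
	handler_head = node;
}

/* Remove the nearest handler when the protected region ends normally. */
static void handler_pop(void)
{
	handler_head = handler_head->next;
}

/* Find the head of the list, pop it, and pass it to longjmp. */
static void unwind_to_nearest(void)
{
	struct handler_node *node = handler_head;
	if (node == NULL)
	{
		abort(); /* no handler registered */
	}
	handler_head = node->next;
	longjmp(node->env, 1);
}

/* Hypothetical operation that reports failure by unwinding. */
static int risky(int fail)
{
	if (fail)
	{
		unwind_to_nearest();
	}
	return 42;
}

/* Runs risky() under a handler; returns -1 if it unwound. */
int run_protected(int fail)
{
	struct handler_node node;
	handler_push(&node);
	if (setjmp(node.env) == 0)
	{
		int result = risky(fail);
		handler_pop();
		return result;
	}
	/* Reached via longjmp; the node was already popped by the unwinder. */
	return -1;
}
```

The node lives in the frame of the function that registers it, so no heap allocation is needed, and the list is restored to its previous state on both the normal and the unwinding path.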
100
103
101
104
# Handling errors even in the presence of stack overflow
102
105
@@ -116,7 +119,7 @@ But what happens if you want to jump to the nearest error handler registered as
116
119
First, the CPU will trap because a `csp`-relative load fails because `csp` (the capability stack pointer register) is untagged.
117
120
This transitions to the switcher.
118
121
The switcher will then find that you need to run the stackless error handler (because `csp` is untagged).
119
-
The switcher will then look at the trusted stack and rederive the `csp` value that you had on entry.
122
+
The switcher will then look at the trusted stack and rederive the `csp` value that your compartment had on entry.
120
123
121
124
The stackless error handler *does* have a stack, but it doesn't have a *stack frame*.
122
125
When it's invoked, it can guarantee that `csp` is a capability that authorises access to the stack, but it can't guarantee anything about the address of that capability (other than that it will be in bounds).
@@ -138,7 +141,7 @@ C programmers have to use the macro-based version, which looks like this:
138
141
```c
139
142
CHERIOT_DURING
140
143
{
141
-
// Do some things that may cause a
144
+
// Do some things that may cause a crash
142
145
}
143
146
CHERIOT_HANDLER
144
147
{
@@ -166,7 +169,7 @@ Hopefully this is much easier than writing a custom error handler that detected
166
169
167
170
This isn't a full exception model.
168
171
In particular, there's currently no way in the handler of seeing the cause of the fault.
169
-
In a future iteration, we'll add something like 'Herbceptions': Herb Sutter's proposal for C++ where exceptions must be preallocated objects or primitive values.
172
+
In a future iteration, we'll add something like 'Herbceptions': Herb Sutter's [proposal for C++ where exceptions must be preallocated objects or primitive values](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf).
170
173
171
174
This mechanism is designed to be simple and lightweight, not to be as generic as something that you'd build on a complex system.
172
175
You can protect something against faults with a single C++ wrapper function and 40 bytes of stack space.