You know how async methods that await something incomplete end up creating a few objects, right? There's
the boxed state machine, an Action that moves it forward, a Task[<T>], etc - right?
Well... what if there just weren't?
And what if all you had to do was change your async ValueTask<int> method to async PooledValueTask<int>?
And I hear you; you're saying "but I can't change the public API!". But what if a PooledValueTask<int> really was
a ValueTask<int>? So you can just cheat:
```csharp
public ValueTask<int> DoTheThing() // the outer method is not async
{
    return ReallyDoTheThing(this);

    static async PooledValueTask<int> ReallyDoTheThing(SomeType obj)
    {
        ... await ...
        // (use obj.* instead of this.*)
        ... return ...
    }
}
```

(the use of a static local function here avoids a <>c__DisplayClass wrapper from how the local-function capture context is implemented by the compiler)
And how about if maybe, just maybe, in the future it could be (if this happens) just:
```csharp
[SomeKindOfAttribute] // <=== this is the only change
public async ValueTask<int> DoTheThing()
{
    // no changes here at all
}
```

(although note that in some cases it can work better with the static trick, as above)
Would that be awesome? Because that's what this is!
The PooledValueTask[<T>] etc exist mostly to define a custom builder. The builder in this library uses aggressive pooling of classes
that replace the boxed approach used by default; we recycle them when the state machine completes.
It also makes use of the IValueTaskSource[<T>] API to allow incomplete operations to be represented without a Task[<T>], but with a custom backer.
And we pool that too, recycling it when the task is awaited. The only downside: you can't await the same result twice now, because
once you've awaited it the first time, it has gone. A cycling token is used to make sure you can't accidentally read the incorrect
values after the result has been awaited.
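
To make the recycling and the cycling token a little more concrete, here is a minimal sketch built only on the BCL's `ManualResetValueTaskSourceCore<T>` (available from .NET Core 3.0 / .NET Standard 2.1). It is an illustration under assumptions, not this library's implementation: `RecyclableSource`, `Rent` and `AsValueTask` are hypothetical names, and the pool is a plain `ConcurrentQueue<T>` chosen for brevity.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical illustration type; this is NOT PooledAwait's implementation.
public sealed class RecyclableSource : IValueTaskSource<int>
{
    private static readonly ConcurrentQueue<RecyclableSource> s_pool =
        new ConcurrentQueue<RecyclableSource>();

    // Mutable BCL struct that tracks the result, the continuation and a version.
    // It must not be stored in a readonly field.
    private ManualResetValueTaskSourceCore<int> _core;

    private RecyclableSource() { }

    // Reuse a recycled instance when one is available; allocate otherwise.
    public static RecyclableSource Rent() =>
        s_pool.TryDequeue(out var item) ? item : new RecyclableSource();

    // The ValueTask captures the current version as its token.
    public ValueTask<int> AsValueTask() => new ValueTask<int>(this, _core.Version);

    public void SetResult(int value) => _core.SetResult(value);

    public int GetResult(short token)
    {
        // Throws InvalidOperationException if the token is stale.
        // (A faulted result also throws here; this sketch does not recycle in that case.)
        int result = _core.GetResult(token);
        _core.Reset();         // bumps the version, invalidating the old token
        s_pool.Enqueue(this);  // recycle for the next operation
        return result;
    }

    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);

    public void OnCompleted(Action<object> continuation, object state, short token,
        ValueTaskSourceOnCompletedFlags flags) =>
        _core.OnCompleted(continuation, state, token, flags);
}
```

With this sketch, renting a source, calling `AsValueTask()`, then `SetResult(42)` and awaiting the `ValueTask<int>` yields 42 and returns the source to the pool. Awaiting the same `ValueTask<int>` a second time throws `InvalidOperationException`, because its token no longer matches the source's version.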
We can even do this for Task[<T>], except here we can only avoid the boxed state machine; hence PooledTask[<T>] exists too. No custom backing in this case, though, since a Task[<T>] will
need to be allocated (except for Task.CompletedTask, which we special-case).
The results below are based on an operation that uses Task.Yield() to ensure that the operations are incomplete; ".NET" means the built-in, out-of-the-box implementation; "Pooled" means the implementation from this library.
In particular, notice:
- zero allocations for PooledValueTask[<T>] vs ValueTask[<T>] (on .NET Core; significantly reduced on .NET Framework)
- reduced allocations for PooledTask[<T>] vs Task[<T>]
- no performance degradation; just lower allocations

| Method | Job | Runtime | Categories | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |----- |-------- |------------- |---------:|----------:|----------:|-------:|-------:|-------:|----------:|
| .NET | Clr | Clr | Task<T> | 2.159 us | 0.0427 us | 0.0474 us | 0.0508 | 0.0039 | - | 344 B |
| Pooled | Clr | Clr | Task<T> | 2.037 us | 0.0246 us | 0.0230 us | 0.0273 | 0.0039 | - | 182 B |
| .NET | Core | Core | Task<T> | 1.397 us | 0.0024 us | 0.0022 us | 0.0176 | - | - | 120 B |
| Pooled | Core | Core | Task<T> | 1.349 us | 0.0058 us | 0.0054 us | 0.0098 | - | - | 72 B |
| | | | | | | | | | | |
| .NET | Clr | Clr | Task | 2.065 us | 0.0200 us | 0.0167 us | 0.0508 | 0.0039 | - | 336 B |
| Pooled | Clr | Clr | Task | 1.979 us | 0.0179 us | 0.0167 us | 0.0273 | 0.0039 | - | 182 B |
| .NET | Core | Core | Task | 1.390 us | 0.0159 us | 0.0149 us | 0.0176 | - | - | 112 B |
| Pooled | Core | Core | Task | 1.361 us | 0.0055 us | 0.0051 us | 0.0098 | - | - | 72 B |
| | | | | | | | | | | |
| .NET | Clr | Clr | ValueTask<T> | 2.087 us | 0.0403 us | 0.0431 us | 0.0547 | 0.0078 | 0.0039 | 352 B |
| Pooled | Clr | Clr | ValueTask<T> | 1.924 us | 0.0248 us | 0.0220 us | 0.0137 | 0.0020 | - | 100 B |
| .NET | Core | Core | ValueTask<T> | 1.405 us | 0.0078 us | 0.0073 us | 0.0195 | - | - | 128 B |
| Pooled | Core | Core | ValueTask<T> | 1.374 us | 0.0116 us | 0.0109 us | - | - | - | - |
| | | | | | | | | | | |
| .NET | Clr | Clr | ValueTask | 2.056 us | 0.0206 us | 0.0183 us | 0.0508 | 0.0039 | - | 344 B |
| Pooled | Clr | Clr | ValueTask | 1.948 us | 0.0388 us | 0.0416 us | 0.0137 | 0.0020 | - | 100 B |
| .NET | Core | Core | ValueTask | 1.408 us | 0.0140 us | 0.0117 us | 0.0176 | - | - | 120 B |
| Pooled | Core | Core | ValueTask | 1.366 us | 0.0039 us | 0.0034 us | - | - | - | - |

Note that most of the remaining allocations are actually the work-queue internals of Task.Yield() (i.e. how
ThreadPool.QueueUserWorkItem works) - we've removed virtually all of the unnecessary overheads that came from the
async machinery. Most real-world scenarios aren't using Task.Yield() - they are waiting on external data, etc - so
they won't see these. Plus they are effectively zero on .NET Core 3.
The tests do the exact same thing; the only thing that changes is the return type, i.e. whether it is
async Task<int>, async ValueTask<int>, async PooledTask<int> or async PooledValueTask<int>.
All of them have the same threading/execution-context/sync-context semantics; there's no cheating going on.