qmfrederik
Collaborator

No description provided.

@davidchisnall
Member

We have a benchmark like that in one of the tests (conditionally compiled, off by default). It's a bit misleading because every iteration hits the same cache line and is always predicted correctly by the branch predictor. Last time I did some deep analysis, it was taking around 4 cycles on a Haswell chip to go through the entire message-send machinery. Performance drops off sharply if the dtable lookup misses in the L1 cache.
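To illustrate why a loop like that flatters the fast path, here is a rough sketch in plain C. It is not the benchmark from the test suite and it does not use the runtime's real dispatch table: `toy_dtable`, `lookup`, `run_hot_loop`, and `ns_per_call` are hypothetical names made up for this example, and timing with `clock_gettime` assumes a POSIX system. Every iteration loads the same table slot and jumps to the same target, so the load stays in L1 and the indirect branch is trivially predictable.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a method implementation pointer. */
typedef int (*imp_t)(int);

static int add_one(int x) { return x + 1; }

/* Toy one-entry dispatch table; NOT the runtime's real dtable layout. */
static imp_t toy_dtable[1] = { add_one };

/* Read the slot through a volatile pointer so the compiler reloads it
 * on every call instead of hoisting the load out of the loop. */
static imp_t lookup(size_t selector_index)
{
    imp_t volatile *slot = &toy_dtable[selector_index];
    return *slot;
}

/* Best-case loop: same slot, same target, every iteration. */
static int run_hot_loop(int iterations)
{
    assert(iterations >= 0);
    int acc = 0;
    for (int i = 0; i < iterations; i++) {
        acc = lookup(0)(acc);
    }
    return acc;
}

/* Average wall-clock nanoseconds per call for the hot loop.
 * The loop result is written to *result_out so the work is observable. */
static double ns_per_call(int iterations, int *result_out)
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    *result_out = run_hot_loop(iterations);
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return iterations > 0 ? elapsed_ns / iterations : 0.0;
}
```

Calling `ns_per_call(100000000, &result)` from a small driver gives a per-call figure for this best case only. A more representative measurement would need many receivers and selectors spread over more memory than fits in L1, which this sketch deliberately does not attempt.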

Running benchmarks in CI on GitHub Actions runners is not very useful because each run ends up on a random VM (where CPUID lies and may pretend to be an older core than it is), so results may vary hugely with no code changes.
