Conversation


@geekswaroop geekswaroop commented Nov 12, 2025

Description

TLDR
With this change, services will spend fewer CPU cycles when retrieving the header names for a given yarpc call.

What
This diff introduces HeaderNamesAll() on the Call struct and All() on the Headers struct. Both return an iterator to the caller.

How
Uses Go's Seq2 iterators instead of manually looping, pre-sizing a slice, and sorting.
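
To make the approach concrete, here is a minimal sketch of the old and new paths. This is not the yarpc source: the Headers/Call layouts and the items field name are assumptions made for illustration only.

package headers

import (
	"iter"
	"sort"
)

type Headers struct {
	items map[string]string
}

// All yields header name/value pairs lazily, in unspecified order,
// without building an intermediate slice (Go 1.23+ range-over-func).
func (h Headers) All() iter.Seq2[string, string] {
	return func(yield func(string, string) bool) {
		for name, value := range h.items {
			if !yield(name, value) {
				return
			}
		}
	}
}

type Call struct {
	headers Headers
}

// HeaderNames is the pre-existing, allocating form: pre-size, copy, sort.
func (c *Call) HeaderNames() []string {
	names := make([]string, 0, len(c.headers.items))
	for name := range c.headers.items {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// HeaderNamesAll yields only the names, reusing Headers.All.
func (c *Call) HeaderNamesAll() iter.Seq[string] {
	return func(yield func(string) bool) {
		for name := range c.headers.All() {
			if !yield(name) {
				return
			}
		}
	}
}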

How to use
With this change, services using yarpc can iterate over header names without building an intermediate sorted slice, with something like this:

call := yarpc.CallFromContext(ctx)
for headerName := range call.HeaderNamesAll() {
    // do something with headerName
}

Best Practices
If you don't need a sorted slice of header names and just want to iterate over them, this API is for you; if you do need a stable, sorted order, keep using HeaderNames().
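
For example, a sketch of both styles side by side (process is a hypothetical helper; HeaderNamesAll is the method added in this PR, HeaderNames is the existing sorted-slice API):

package example

import (
	"context"

	"go.uber.org/yarpc"
)

func process(name string) { /* hypothetical placeholder */ }

func handleHeaders(ctx context.Context) {
	call := yarpc.CallFromContext(ctx)

	// Lazy iteration: no slice allocation, no sorting, unspecified order.
	for name := range call.HeaderNamesAll() {
		process(name)
	}

	// When a stable, sorted order matters, keep using HeaderNames().
	for _, name := range call.HeaderNames() {
		process(name)
	}
}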


Type of Change

[ ] Breaking API Change
[ ] Breaking Semantic or Runtime Change
[x] Internal Implementation Change
[ ] Dependency Update (please also multi-select on whether this upgrade is breaking)
[ ] Code clean up / Refactoring
[ ] No code change (e.g. adding documentation)


Risk Level

[ ] HIGH
[ ] MED
[x] LOW

Reasoning:
Changes are limited to adding a new iterator method. Any breakage will be caught during compilation and in CI.


Potential Impact of Failure

Low.

Reasoning:
Changes are limited to adding a new iterator method. Any breakage will be caught during compilation and in CI; even if something slipped past CI, existing callers would be unaffected at runtime because no existing code path changes.

Benchmarks

Benchmarking shows a significant reduction in latency, allocations, and memory when using iterators with more than one header:

 % benchstat old.txt new.txt
goos: linux
goarch: amd64
cpu: AMD EPYC 7B13
                                        │    old.txt    │                new.txt                │
                                        │    sec/op     │    sec/op      vs base                │
CallHeaderNames/HeaderNames/size=1-48      96.14n ±  3%   141.35n ±  7%  +47.03% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=2-48      157.8n ±  5%    145.5n ±  9%   -7.79% (p=0.023 n=10)
CallHeaderNames/HeaderNames/size=3-48      238.3n ±  4%    156.3n ±  5%  -34.39% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=4-48      265.1n ± 10%    166.8n ± 16%  -37.07% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=5-48      386.5n ±  5%    166.2n ±  2%  -57.00% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=10-48     755.5n ± 17%    245.0n ±  5%  -67.57% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=25-48    1741.5n ± 10%    432.2n ±  3%  -75.18% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=50-48    4446.5n ±  5%    685.5n ±  5%  -84.58% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=100-48   11.086µ ± 25%    1.582µ ± 22%  -85.73% (p=0.000 n=10)
geomean                                    660.8n          279.0n        -57.78%

                                        │   old.txt    │               new.txt               │
                                        │     B/op     │    B/op     vs base                 │
CallHeaderNames/HeaderNames/size=1-48       16.00 ± 0%   40.00 ± 0%  +150.00% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=2-48       48.00 ± 0%   40.00 ± 0%   -16.67% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=3-48      112.00 ± 0%   40.00 ± 0%   -64.29% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=4-48      112.00 ± 0%   40.00 ± 0%   -64.29% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=5-48      240.00 ± 0%   40.00 ± 0%   -83.33% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=10-48     496.00 ± 0%   40.00 ± 0%   -91.94% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=25-48    1008.00 ± 0%   40.00 ± 0%   -96.03% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=50-48    2160.00 ± 0%   40.00 ± 0%   -98.15% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=100-48   4464.00 ± 0%   40.00 ± 0%   -99.10% (p=0.000 n=10)
geomean                                     281.6        40.00        -85.80%

                                        │  old.txt   │                new.txt                │
                                        │ allocs/op  │ allocs/op   vs base                   │
CallHeaderNames/HeaderNames/size=1-48     1.000 ± 0%   3.000 ± 0%  +200.00% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=2-48     2.000 ± 0%   3.000 ± 0%   +50.00% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=3-48     3.000 ± 0%   3.000 ± 0%         ~ (p=1.000 n=10) ¹
CallHeaderNames/HeaderNames/size=4-48     3.000 ± 0%   3.000 ± 0%         ~ (p=1.000 n=10) ¹
CallHeaderNames/HeaderNames/size=5-48     4.000 ± 0%   3.000 ± 0%   -25.00% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=10-48    5.000 ± 0%   3.000 ± 0%   -40.00% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=25-48    6.000 ± 0%   3.000 ± 0%   -50.00% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=50-48    7.000 ± 0%   3.000 ± 0%   -57.14% (p=0.000 n=10)
CallHeaderNames/HeaderNames/size=100-48   8.000 ± 0%   3.000 ± 0%   -62.50% (p=0.000 n=10)
geomean                                   3.671        3.000        -18.27%
¹ all samples are equal
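
For context on how numbers like these are produced, here is a stand-in benchmark sketch (placed in a _test.go file). It is not the PR's actual benchmark code: the map-backed headers and the allNames helper are assumptions, but the sub-benchmark naming and b.ReportAllocs mirror the output above.

package headers

import (
	"fmt"
	"testing"
)

// allNames is a stand-in Seq-style iterator over a header map.
func allNames(m map[string]string) func(yield func(string) bool) {
	return func(yield func(string) bool) {
		for name := range m {
			if !yield(name) {
				return
			}
		}
	}
}

var sink string // keeps the loop body from being optimized away

func BenchmarkCallHeaderNames(b *testing.B) {
	for _, size := range []int{1, 2, 3, 4, 5, 10, 25, 50, 100} {
		headers := make(map[string]string, size)
		for i := 0; i < size; i++ {
			headers[fmt.Sprintf("header-%d", i)] = "value"
		}
		b.Run(fmt.Sprintf("HeaderNames/size=%d", size), func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				for name := range allNames(headers) {
					sink = name
				}
			}
		})
	}
}

The old and new implementations are benchmarked in separate runs (for example, go test -bench=CallHeaderNames -count=10 > old.txt before the change and the same command into new.txt after it), and the two result files are compared with benchstat as shown above.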

Reasoning

For a single header, HeaderNamesAll shows a 47% latency increase (96ns → 141ns), 150% more memory (16B → 40B), and 200% more allocations (1 → 3) compared to HeaderNames, because the iterator's fixed overhead dominates at that scale. HeaderNames simply allocates a single-element slice and returns it, with negligible sorting cost. HeaderNamesAll, in contrast, must create three closures (the method's returned closure capturing the Call receiver, the Headers.All closure capturing the map, and the iterator state machine), execute multiple function calls through those closures, and perform yield checks. That fixed overhead only pays off when amortized across larger header counts, where HeaderNames suffers from O(n log n) sorting and repeated slice growth; the crossover already happens at two headers, where the iterator is about 8% faster.
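
Re-annotating the HeaderNamesAll sketch from the How section, the fixed cost lives here (assumed shapes, not the yarpc source; the compiler's actual inlining and escape decisions can be inspected with go build -gcflags=-m):

// Returned closures must live on the heap: this one captures the receiver.
func (c *Call) HeaderNamesAll() iter.Seq[string] {
	return func(yield func(string) bool) {
		// Headers.All constructs a second closure capturing the header map.
		for name := range c.headers.All() {
			// The compiler synthesizes a yield function from the loop body;
			// the PR's reasoning counts this machinery as the third allocation.
			if !yield(name) {
				return
			}
		}
	}
}

None of this scales with the number of headers, which is why the per-call cost stays at roughly 40 B and 3 allocations across every size in the table above.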

Test

CI tests here.


CLAassistant commented Nov 12, 2025

CLA assistant check
All committers have signed the CLA.
