
Commit c77fd07

Authored by dashpole, MrAlias, and carlosalberto
Recommend using a time-unbiased reservoir sampling algorithm for histograms (#4678)
Fixes #4675

## Changes

Change the recommended algorithm for histogram reservoirs to be time-unbiased. I've left the previous algorithm as an option to ensure this change is backwards-compatible.

Go prototype: open-telemetry/opentelemetry-go#7458

* [x] Links to the prototypes (when adding or changing features)
* [x] [`CHANGELOG.md`](https://github.com/open-telemetry/opentelemetry-specification/blob/main/CHANGELOG.md) file updated for non-trivial changes

---------

Co-authored-by: Tyler Yahn <[email protected]>
Co-authored-by: Carlos Alberto Cortez <[email protected]>
1 parent 110aed2 commit c77fd07

2 files changed: +16 -4 lines changed


CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -13,6 +13,9 @@ release.

 ### Metrics

+- `AlignedHistogramBucketExemplarReservoir` SHOULD use a time-weighted algorithm.
+  ([#4678](https://github.com/open-telemetry/opentelemetry-specification/pull/4678))
+
 ### Logs

 ### Baggage

specification/metrics/sdk.md

Lines changed: 13 additions & 4 deletions
@@ -1203,6 +1203,7 @@ algorithm](https://en.wikipedia.org/wiki/Reservoir_sampling) can be used:
 if bucket < num_buckets then
   reservoir[bucket] = measurement
 end
+num_measurements_seen += 1
 ```

 Any stateful portion of sampling computation SHOULD be reset every collection
@@ -1217,15 +1218,23 @@ contention. Otherwise, a default size of `1` SHOULD be used.
 #### AlignedHistogramBucketExemplarReservoir

 This Exemplar reservoir MUST take a configuration parameter that is the
-configuration of a Histogram. This implementation MUST keep the last seen
-measurement that falls within a histogram bucket. The reservoir will accept
-measurements using the equivalent of the following naive algorithm:
+configuration of a Histogram. This implementation MUST store at most one
+measurement that falls within a histogram bucket, and SHOULD use a
+uniformly-weighted sampling algorithm based on the number of measurements the
+bucket has seen so far to determine if the offered measurements should be
+sampled. Alternatively, the implementation MAY instead keep the last seen
+measurement that falls within a histogram bucket.
+
+The reservoir will accept measurements using the equivalent of the following
+naive algorithm:

 ```
 bucket = find_histogram_bucket(measurement)
-if bucket < num_buckets then
+num_measurements_seen_bucket = num_measurements_seen[bucket]
+if random_integer(0, num_measurements_seen_bucket) == 0 then
   reservoir[bucket] = measurement
 end
+num_measurements_seen[bucket] += 1

 def find_histogram_bucket(measurement):
   for boundary, idx in bucket_boundaries do
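
For readers who want to try the per-bucket sampling from the second `sdk.md` hunk, here is a minimal Python sketch. It is an illustration under stated assumptions, not the specification's or the Go prototype's implementation: the class name, the method names, the injectable `rng` and the reset in `collect` are all hypothetical. The diff is truncated before the body of `find_histogram_bucket`, so the upper-inclusive boundary lookup is also an assumption. The idea it shows is that the i-th measurement offered to a bucket replaces the stored one with probability 1/i and then survives each later measurement j with probability (j-1)/j, so after n measurements each is retained with probability 1/n.

```python
import bisect
import random


class AlignedHistogramBucketReservoirSketch:
    """Hypothetical sketch: keeps at most one sampled measurement per bucket."""

    def __init__(self, bucket_boundaries, rng=None):
        # Assumption: boundaries are sorted upper bounds; anything above the
        # last boundary falls into one extra overflow bucket.
        self.bucket_boundaries = list(bucket_boundaries)
        num_buckets = len(self.bucket_boundaries) + 1
        self.reservoir = [None] * num_buckets
        self.num_measurements_seen = [0] * num_buckets
        self.rng = rng or random.Random()

    def find_histogram_bucket(self, measurement):
        # bisect_left returns the first index whose boundary is >= measurement,
        # i.e. an upper-inclusive bucket lookup (assumed, see lead-in).
        return bisect.bisect_left(self.bucket_boundaries, measurement)

    def offer(self, measurement):
        bucket = self.find_histogram_bucket(measurement)
        seen = self.num_measurements_seen[bucket]
        # randint is inclusive on both ends, so the condition holds with
        # probability 1 / (seen + 1); the first measurement is always stored.
        if self.rng.randint(0, seen) == 0:
            self.reservoir[bucket] = measurement
        self.num_measurements_seen[bucket] += 1

    def collect(self):
        # Return the sampled measurements and reset the sampling state
        # (assumed to happen once per collection cycle).
        sampled = [m for m in self.reservoir if m is not None]
        num_buckets = len(self.reservoir)
        self.reservoir = [None] * num_buckets
        self.num_measurements_seen = [0] * num_buckets
        return sampled
```

Keeping a separate counter per bucket is what makes the choice uniform within each bucket. The "last seen" alternative that the diff still permits corresponds to unconditionally overwriting `reservoir[bucket]` in `offer`, which always favors the most recent measurement.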
