Skip to content

Commit 319e0d4

Browse files
committed
Remove infill 0 buckets from custom distributions
1 parent 6475939 commit 319e0d4

File tree

5 files changed

+26
-24
lines changed

5 files changed

+26
-24
lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,8 @@
22

33
[Full changelog](https://github.com/mozilla/glean/compare/v65.2.1...main)
44

5+
* General
6+
* BREAKING: Remove infill 0 buckets from custom distributions ([#3246](https://github.com/mozilla/glean/pull/3246))
57
* Swift
68
* Make `EventMetricType`, `ObjectMetricType`, `URLMetricType` and `Ping` `Sendable` ([#3255](https://github.com/mozilla/glean/pull/3255))
79
* Glean for iOS is now being built with Xcode 16.4 ([#3270](https://github.com/mozilla/glean/pull/3270))

docs/dev/core/internal/payload.md

Lines changed: 7 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -228,8 +228,8 @@ A [Custom distribution](../../../book/reference/metrics/custom_distribution.md)
228228
| `sum` | Integer | The sum of all recorded values. |
229229
| `values` | Map<String, Integer> | The values in each bucket. The key is the minimum value for the range of that bucket. |
230230

231-
A contiguous range of buckets is always sent, so that the server can aggregate and visualize distributions, without knowing anything about the specific bucketing function used.
232-
This range starts with the first bucket (as specified in the `range_min` parameter), and ends at one bucket beyond the last bucket with a non-zero accumulation (so that the upper bound on the last bucket is retained).
231+
From Glean v66.0.0 on the `values` only includes filled buckets.
232+
Empty buckets are not sent.
233233

234234
For example, suppose you had a custom distribution defined by the following parameters:
235235

@@ -241,23 +241,21 @@ For example, suppose you had a custom distribution defined by the following para
241241
The following shows the recorded values vs. what is sent in the payload.
242242

243243
```
244-
recorded: 12: 2, 22: 1
245-
sent: 10: 0, 12: 2, 14: 0, 17: 0, 19: 0, 22: 1, 24: 0
244+
recorded: 12: 2, 22: 1
245+
sent: 12: 2, 22: 1
246246
```
247247

248+
Up until Glean v65.0.3 a contiguous range of buckets was always sent, so that the server can aggregate and visualize distributions, without knowing anything about the specific bucketing function used.
249+
This range started with the first bucket (as specified in the `range_min` parameter), and ends at one bucket beyond the last bucket with a non-zero accumulation (so that the upper bound on the last bucket is retained).
250+
248251
#### Example:
249252

250253
```json
251254
{
252255
"sum": 3,
253256
"values": {
254-
"10": 0,
255257
"12": 2,
256-
"14": 0,
257-
"17": 0,
258-
"19": 0,
259258
"22": 1,
260-
"24": 0
261259
}
262260
}
263261
```

glean-core/src/histogram/exponential.rs

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -57,7 +57,7 @@ fn exponential_range(min: u64, max: u64, bucket_count: usize) -> Vec<u64> {
5757
///
5858
/// Buckets are pre-computed at instantiation with an exponential distribution from `min` to `max`
5959
/// and `bucket_count` buckets.
60-
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, MallocSizeOf)]
60+
#[derive(Debug, Clone, Serialize, Deserialize, Eq, MallocSizeOf)]
6161
pub struct PrecomputedExponential {
6262
// Don't serialize the (potentially large) array of ranges, instead compute them on first
6363
// access.
@@ -68,6 +68,14 @@ pub struct PrecomputedExponential {
6868
pub(crate) bucket_count: usize,
6969
}
7070

71+
impl PartialEq for PrecomputedExponential {
72+
fn eq(&self, other: &Self) -> bool {
73+
// `bucket_ranges` are computed on-demand based on `min`, `max` and `bucket_count`,
74+
// so we don't need to compare it (and thus forcing it to be computed)
75+
self.min == other.min && self.max == other.max && self.bucket_count == other.bucket_count
76+
}
77+
}
78+
7179
impl Bucketing for PrecomputedExponential {
7280
/// Get the bucket for the sample.
7381
///

glean-core/src/histogram/linear.rs

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ fn linear_range(min: u64, max: u64, count: usize) -> Vec<u64> {
3737
///
3838
/// Buckets are pre-computed at instantiation with a linear distribution from `min` to `max`
3939
/// and `bucket_count` buckets.
40-
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, MallocSizeOf)]
40+
#[derive(Debug, Clone, Serialize, Deserialize, Eq, MallocSizeOf)]
4141
pub struct PrecomputedLinear {
4242
// Don't serialize the (potentially large) array of ranges, instead compute them on first
4343
// access.
@@ -48,6 +48,12 @@ pub struct PrecomputedLinear {
4848
pub(crate) bucket_count: usize,
4949
}
5050

51+
impl PartialEq for PrecomputedLinear {
52+
fn eq(&self, other: &Self) -> bool {
53+
self.min == other.min && self.max == other.max && self.bucket_count == other.bucket_count
54+
}
55+
}
56+
5157
impl Bucketing for PrecomputedLinear {
5258
/// Get the bucket for the sample.
5359
///

glean-core/src/histogram/mod.rs

Lines changed: 1 addition & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -124,19 +124,7 @@ impl<B: Bucketing> Histogram<B> {
124124
/// Gets a snapshot of all values from the first bucket until one past the last filled bucket,
125125
/// filling in empty buckets with 0.
126126
pub fn snapshot_values(&self) -> HashMap<u64, u64> {
127-
let mut res = self.values.clone();
128-
129-
let max_bucket = self.values.keys().max().cloned().unwrap_or(0);
130-
131-
for &min_bucket in self.bucketing.ranges() {
132-
// Fill in missing entries.
133-
let _ = res.entry(min_bucket).or_insert(0);
134-
// stop one after the last filled bucket
135-
if min_bucket > max_bucket {
136-
break;
137-
}
138-
}
139-
res
127+
self.values.clone()
140128
}
141129

142130
/// Clear this histogram.

0 commit comments

Comments
 (0)