ESQL: Implement percentiles agg for exponential_histogram #137720

JonasKunz · 2025-11-07T11:08:58Z

Implements PERCENTILE agg support for exponential_histogram (part of #137549):

Adds a new MERGE aggregation to merge exponential histograms. This aggregation is intentionally not registered and can only be used in surrogates
Adds a HISTOGRAM_PERCENTILE(MERGE(field), percentile) surrogate to implement PERCENTILES.

This time I remembered to also run the tests locally in non-snapshot mode, so something like #137693 shouldn't happen again.

Looking at the logical plans for the CSV tests also shows the performance benefit of splitting the percentile extraction and the merge aggregation:

Project[[instance{f}#29, p0{r}#6, p50{r}#18, p99{r}#21, p100{r}#15]]
\_TopN[[Order[instance{f}#29,ASC,LAST]],1000[INTEGER],false]
  \_Eval[[HISTOGRAMPERCENTILE($$MERGE$p0$0{r$}#32,0[INTEGER]) AS p0#6,
          HISTOGRAMPERCENTILE($$MERGE$p0$0{r$}#32,50[INTEGER]) AS p50#9,
          HISTOGRAMPERCENTILE($$MERGE$p0$0{r$}#32,99[INTEGER]) AS p99#12,
          HISTOGRAMPERCENTILE($$MERGE$p0$0{r$}#32,100[INTEGER]) AS p100#15,
          ROUND(p50{r}#9,7[INTEGER]) AS p50#18,
          ROUND(p99{r}#12,7[INTEGER]) AS p99#21]]
    \_Aggregate[[instance{f}#29],[MERGE(responseTime{f}#31,true[BOOLEAN],PT0S[TIME_DURATION]) AS $$MERGE$p0$0#32, instance{f}#29]]
      \_Filter[NOT(STARTSWITH(instance{f}#29,dummy[KEYWORD]))]
        \_EsRelation[exp_histo_sample][@timestamp{f}#30, instance{f}#29, responseTime{f}#3..]

Even though we query four percentiles, we only do the merge aggregation once. This would not be the case if merge + percentile extraction were fused in a single aggregation.

@kkrik-es @not-napoleon : I intended to add you as optional reviewers. So feel free to ignore the request.

elasticsearchmachine · 2025-11-07T15:38:59Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

kkrik-es · 2025-11-11T07:32:05Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/exponential_histogram.csv-spec

+instance:keyword | p0:double | p50:double | p99:double | p100:double
+instance-0       | 2.4E-4    | 0.0211404  | 1.0432946  | 6.786232
+instance-1       | 2.17E-4   | 6.469E-4   | 0.1422151  | 3.190723
+instance-2       | 2.2E-4    | 6.469E-4   | 0.0857672  | 2.7059714542564097


nit: round p100 too?

That's not necessary, because p0 is the min and p100 is the max. Those are exact across machines and merging

kkrik-es · 2025-11-11T07:36:30Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/AggregateMapper.java

            case DOC -> DataType.DOC_DATA_TYPE;
-            case FLOAT, NULL, COMPOSITE, AGGREGATE_METRIC_DOUBLE, EXPONENTIAL_HISTOGRAM, UNKNOWN -> throw new EsqlIllegalArgumentException(
+            case EXPONENTIAL_HISTOGRAM -> DataType.EXPONENTIAL_HISTOGRAM;
+            case FLOAT, NULL, COMPOSITE, AGGREGATE_METRIC_DOUBLE, UNKNOWN -> throw new EsqlIllegalArgumentException(


Fix also AGGREGATE_METRIC_DOUBLE? We have a type for it, no?

We do, but we don't have any non-surrogate aggregations for AGGREGATE_METRIC_DOUBLE yet. So this would be effectively dead and untested code, which I would want to avoid adding right now. I think we should only add it when we actually add a non-surrogate aggregation for aggregate metric double.

kkrik-es

LGTM, leaving it to Nhat to stamp.

JonasKunz · 2025-11-07T14:25:53Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/exponential_histogram.csv-spec

+
+FROM exp_histo_sample | WHERE NOT STARTS_WITH(instance, "dummy") 
+ | STATS p0 = PERCENTILE(responseTime,0),  p50 = PERCENTILE(responseTime,50),  p99 = PERCENTILE(responseTime, 99),  p100 = PERCENTILE(responseTime,100)
+ | EVAL p50 = ROUND(p50, 7), p99 = ROUND(p99, 7) // rounding to avoid floating point precision issues, min and max are exact so no rounding needed


I had an interesting test failure where the percentiles where a tiny, tiny bit off the exact values on CI.
This also wouldn't reproduce locally with the same seed.
My guess here is that this is due to the different implementation of Math.log / Math.exp on my machine (Mac M1) vs the CI runners. See also this SO thread

For that reason I've added the rounding to make the tests more robust.

JonasKunz · 2025-11-07T15:31:18Z

...l/compute/src/main/java/org/elasticsearch/compute/data/ExponentialHistogramBlockBuilder.java

        // We should add a dedicated encoding when building a block from computed histograms which do not originate from doc values
        // That encoding should be optimized for speed and support storing the zero threshold as (scale, index) pair
        ZeroBucket zeroBucket = histogram.zeroBucket();
-        assert zeroBucket.compareZeroThreshold(ZeroBucket.minimalEmpty()) == 0 || zeroBucket.isIndexBased() == false


I have removed this assert, because this can now actually occur: We merge histograms, which means the zero-threshold of the result can be defined by a (scale, index) tuple instead of a double value. So when storing that merged histogram into the block, we have to "round" that to the nearest double, introducing a tiny bit of error.

This won't matter in practice because the error is tiny and we are only estimating percentiles anyway, but the errors could add up. I say this is okay for a tech preview, but the TODO above still stands that we should fix this latest before moving out of tech preview.

Ideally we can add some benchmarks first, so that we can also see if the encoding intended for disk-storage performed below is too costly (=it appears in the flame graphs) and we should do something special or not.

JonasKunz · 2025-11-11T08:33:00Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/exponential_histogram.csv-spec

+instance:keyword | p0:double | p50:double | p99:double | p100:double
+instance-0       | 2.4E-4    | 0.0211404  | 1.0432946  | 6.786232
+instance-1       | 2.17E-4   | 6.469E-4   | 0.1422151  | 3.190723
+instance-2       | 2.2E-4    | 6.469E-4   | 0.0857672  | 2.7059714542564097


That's not necessary, because p0 is the min and p100 is the max. Those are exact across machines and merging

JonasKunz · 2025-11-11T09:18:49Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/AggregateMapper.java

            case DOC -> DataType.DOC_DATA_TYPE;
-            case FLOAT, NULL, COMPOSITE, AGGREGATE_METRIC_DOUBLE, EXPONENTIAL_HISTOGRAM, UNKNOWN -> throw new EsqlIllegalArgumentException(
+            case EXPONENTIAL_HISTOGRAM -> DataType.EXPONENTIAL_HISTOGRAM;
+            case FLOAT, NULL, COMPOSITE, AGGREGATE_METRIC_DOUBLE, UNKNOWN -> throw new EsqlIllegalArgumentException(


We do, but we don't have any non-surrogate aggregations for AGGREGATE_METRIC_DOUBLE yet. So this would be effectively dead and untested code, which I would want to avoid adding right now. I think we should only add it when we actually add a non-surrogate aggregation for aggregate metric double.

JonasKunz added 3 commits November 7, 2025 10:22

Add exponential histogram merge aggregation

13f7c44

Implemented unit tests

e731a32

Implemented percentile agg as surrogate

59fa978

JonasKunz added the >non-issue label Nov 7, 2025

elasticsearchmachine added v9.3.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Nov 7, 2025

elasticsearchmachine and others added 7 commits November 7, 2025 11:15

[CI] Auto commit changes from spotless

5e2e248

fix checkstyle

8b4d0a3

Add CSV tests

cc19f26

Add rounding to CSV tests to avoid floating point trouble

001d7b1

Add javadoc

b026eba

use separate capability for percentiles to make bwc tests happy

f6601f1

Merge branch 'main' into esql-exp-histo-merge

67f5622

JonasKunz marked this pull request as ready for review November 7, 2025 15:34

JonasKunz requested a review from dnhatn November 7, 2025 15:34

elasticsearchmachine added the needs:triage Requires assignment of a team area label label Nov 7, 2025

JonasKunz requested review from kkrik-es and not-napoleon November 7, 2025 15:34

JonasKunz added :Analytics/ES|QL AKA ESQL and removed needs:triage Requires assignment of a team area label labels Nov 7, 2025

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Nov 7, 2025

kkrik-es reviewed Nov 11, 2025

View reviewed changes

JonasKunz commented Nov 11, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ESQL: Implement percentiles agg for exponential_histogram #137720

ESQL: Implement percentiles agg for exponential_histogram #137720

Uh oh!

JonasKunz commented Nov 7, 2025 •

edited

Loading

Uh oh!

elasticsearchmachine commented Nov 7, 2025

Uh oh!

kkrik-es Nov 11, 2025

Uh oh!

JonasKunz Nov 11, 2025

Uh oh!

kkrik-es Nov 11, 2025

Uh oh!

JonasKunz Nov 11, 2025

Uh oh!

kkrik-es left a comment

Uh oh!

JonasKunz Nov 7, 2025

Uh oh!

JonasKunz Nov 7, 2025

Uh oh!

JonasKunz Nov 11, 2025

Uh oh!

JonasKunz Nov 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ESQL: Implement percentiles agg for exponential_histogram #137720

Are you sure you want to change the base?

ESQL: Implement percentiles agg for exponential_histogram #137720

Uh oh!

Conversation

JonasKunz commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

elasticsearchmachine commented Nov 7, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kkrik-es left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

JonasKunz commented Nov 7, 2025 •

edited

Loading