Skip to content

Conversation

dol
Copy link
Contributor

@dol dol commented Jun 11, 2025

This pull request introduces a new CelBasedSampler to the OpenTelemetry Java SDK, enabling advanced sampling rules using the Common Expression Language (CEL). It also includes updates to the existing RuleBasedRoutingSampler for consistency and clarity. Below is a summary of the most important changes grouped by theme:

New CelBasedSampler Implementation:

  • Added the CelBasedSampler class, which evaluates CEL expressions to make sampling decisions based on span attributes. It includes a fallback sampler and supports multiple expressions. (samplers/src/main/java/io/opentelemetry/contrib/sampler/CelBasedSampler.java)
  • Introduced the CelBasedSamplerBuilder to construct CelBasedSampler instances with methods for adding expressions and specifying actions (e.g., DROP or RECORD_AND_SAMPLE). (samplers/src/main/java/io/opentelemetry/contrib/sampler/CelBasedSamplerBuilder.java)
  • Added the CelBasedSamplingExpression class to encapsulate individual CEL expressions and their associated samplers. (samplers/src/main/java/io/opentelemetry/contrib/sampler/CelBasedSamplingExpression.java)
  • Implemented a declarative configuration provider for CelBasedSampler, enabling configuration through YAML files. (samplers/src/main/java/io/opentelemetry/contrib/sampler/internal/CelBasedSamplerComponentProvider.java)
# The fallback sampler to use if no expressions match.
fallback_sampler:
  always_on:
# List of CEL expressions to evaluate. Expressions are evaluated in order.
expressions:
  # The action to take when the expression evaluates to true. Must be one of: DROP, RECORD_AND_SAMPLE.
  - action: DROP
    # The CEL expression to evaluate. Must return a boolean.
    expression: attribute['url.path'].startsWith('/actuator')
  - action: RECORD_AND_SAMPLE
    expression: attribute['http.method'] == 'GET' && attribute['http.status_code'] < 400

Documentation Updates:

  • Updated samplers/README.md to document the new CelBasedSampler, including its usage, schema, and example configurations. (samplers/README.md) [1] [2]

Dependency Additions:

  • Added a dependency on the dev.cel:cel:0.9.0 library to enable CEL expression evaluation. (samplers/build.gradle.kts)

Updates to RuleBasedRoutingSampler:

  • Renamed SamplingRule to RuleBasedRoutingSamplingRule for clarity and updated all related references in the RuleBasedRoutingSampler and its builder. (samplers/src/main/java/io/opentelemetry/contrib/sampler/RuleBasedRoutingSampler.java, samplers/src/main/java/io/opentelemetry/contrib/sampler/RuleBasedRoutingSamplerBuilder.java, samplers/src/main/java/io/opentelemetry/contrib/sampler/RuleBasedRoutingSamplingRule.java) [1] [2] [3] [4] [5] [6] [7]

These changes collectively enhance the SDK's sampling capabilities, allowing users to define sophisticated sampling rules using CEL while maintaining compatibility with existing samplers.

@dol dol requested a review from a team June 11, 2025 15:28
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jun 11, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@trask
Copy link
Member

trask commented Jun 11, 2025

@dol this looks nice!

@johnbley what do you think of CEL vs JEXL in the context of open-telemetry/opentelemetry-java-instrumentation#13590?

@dol
Copy link
Contributor Author

dol commented Jun 12, 2025

Just a note to myself. I need to expand the test cases with the following addition.

  • Define multiple expressions to test the handling and ordering of expressions
    • Add the same expression with DROP and RECORD_AND_SAMPLE. The first expression and action should be take affect.
    • Add more complex expressions that pass the compile step and should work in the verify step like attribute['http.status_code'] < 400 and time based example to sample outside of business hours

@johnbley
Copy link
Member

@dol this looks nice!

@johnbley what do you think of CEL vs JEXL in the context of open-telemetry/opentelemetry-java-instrumentation#13590?

Very interesting! I'll take a look at it!

@dol
Copy link
Contributor Author

dol commented Jun 12, 2025

Just a note to myself. I need to expand the test cases with the following addition.

* Define multiple expressions to test the handling and ordering of expressions
  
  * Add the same expression with DROP and RECORD_AND_SAMPLE. The first expression and action should be take affect.
  * Add more complex expressions that pass the compile step and should work in the verify step like `attribute['http.status_code'] < 400` and time based example to sample outside of business hours

As I refined the tests I found some edge cases in the setup and mapping of the CEL engine. I'll mark the PR as work in progress. Sorry for that.

@dol dol changed the title Common Expression Language (CEL) sampler [WIP] Common Expression Language (CEL) sampler Jun 12, 2025
@trask
Copy link
Member

trask commented Jun 12, 2025

I'll mark the PR as work in progress. Sorry for that.

no worries! you can click "Convert to draft" (hidden under the list of reviewers), and then when you're ready you can click "Ready for review"

@dol dol marked this pull request as draft June 12, 2025 15:51
@dol
Copy link
Contributor Author

dol commented Jun 18, 2025

@trask @jack-berg : I wanted you to give an update on the PR. As I was adding more tests cases to PR I encountered a problem with single/double quote parsing on the declarative config. First I though it's an issue with a complex expression input. But after digging deep into the source code of this project and opentelementry-java I figured out, that the https://opentelemetry.io/docs/specs/otel/configuration/data-model/#environment-variable-substitution implementation on top of snakeyaml can not handle a mix of single and double quotes very well.

I created the following bug report: open-telemetry/opentelemetry-java#7429

As the chances are very high that a complex expression will need to mix single and double quotes, this bug should be addressed first.

@dol
Copy link
Contributor Author

dol commented Jun 19, 2025

open-telemetry/opentelemetry-java#7433 should fix this regression.

@dol dol force-pushed the feature/cel-sampler branch from c21379b to a6e408c Compare June 23, 2025 22:09
@dol
Copy link
Contributor Author

dol commented Jun 23, 2025

The latest version added more unit tests and better coverage. I think the new CEL sampler is ready for an review.
The tests still require the open-telemetry/opentelemetry-java#7433 to be approved, merged and released.
Until then the build and tests will fail.

@dol dol changed the title [WIP] Common Expression Language (CEL) sampler Common Expression Language (CEL) sampler Jun 25, 2025
@dol dol marked this pull request as ready for review June 25, 2025 05:48
@breedx-splk
Copy link
Contributor

Build is pretty broken. Can you look at this @dol thanks!

@dol
Copy link
Contributor Author

dol commented Jul 7, 2025

The latest version added more unit tests and better coverage. I think the new CEL sampler is ready for an review. The tests still require the open-telemetry/opentelemetry-java#7433 to be approved, merged and released. Until then the build and tests will fail.

@breedx-splk I'm waiting for the v1.52.0 release, which should be released soon. This will fix all the broken tests.

@dol
Copy link
Contributor Author

dol commented Aug 6, 2025

@jack-berg @trask The PR is ready for review after the fix was merged and release ( open-telemetry/opentelemetry-java#7433 ) and the BOM version was bumped ( 850933f#diff-df7d51fc1db73056c56958a9784e26310dae8ec239fb940820f5ddea4b655693L5 )

Copy link
Contributor

@breedx-splk breedx-splk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think this is an awesome addition that users will start trying immediately. I left a few comments/questions that I'd like to see addressed...but this is looking solid.

@dol dol requested a review from a team as a code owner September 2, 2025 21:16
@dol dol force-pushed the feature/cel-sampler branch from 029ff17 to 3328c31 Compare September 2, 2025 21:17
@dol
Copy link
Contributor Author

dol commented Sep 2, 2025

@breedx-splk Thank you for the valuable review. I addressed all the remarks.
The newest versions returns the detailed message if the CEL compiles is unable to parse the expression.

@dol dol force-pushed the feature/cel-sampler branch from 50b7e21 to e197503 Compare September 30, 2025 21:02
@breedx-splk
Copy link
Contributor

@dol looks like there are some merge conflicts now...

@dol dol force-pushed the feature/cel-sampler branch from e197503 to f80ebb1 Compare October 1, 2025 18:46
@dol
Copy link
Contributor Author

dol commented Oct 1, 2025

@dol looks like there are some merge conflicts now...

@breedx-splk Thx for the hint. The merge conflict is resolved.

@trask
Copy link
Member

trask commented Oct 1, 2025

I've added this PR as a topic for tomorrow's Java SIG meeting to get a few more eyes on it, in particular since this module is automatically pulled in and available in the OpenTelemetry Java agent (@dol you're of course welcome to join if you'd like!)

+ "localParentNotSampled:AlwaysOffSampler"
+ "}");

// SERVER span to other path should be recorded and sampled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one matches the first expression?

Suggested change
// SERVER span to other path should be recorded and sampled
// SERVER span with host header and 200 response is dropped due to matching with first expression.

@johnbley
Copy link
Member

johnbley commented Oct 1, 2025

I'm not against putting this into contrib at all but wonder if a better long-term approach would be to have a Java implementation of an ottl engine (or a large part of one) for this purpose? For instance, the collector's tailsamplingprocessor reuses ottl for its ottl_conditions. If the same syntax, functions, data model, and scoping semconv could be applied at both the language-level and the collector it would be a big win for coherency.

For clarity, I'm raising a possible future idea, not trying to derail the current work here which looks good!

Copy link
Contributor

@breedx-splk breedx-splk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dol thank you for your patience and understanding with this PR.

We definitely think there is room for this sampler to exist, and many of us really appreciate the flexibility that the CEL brings here. We also think that @johnbley 's idea about OTTL is also interesting, but no implementation of OTTL yet exists in java...but maybe one day! 🤞🏻

So as @trask suggested above, we did indeed talk about this in the SIG meeting this week. The conclusion was that we think it would be best to introduce this as a new contrib module instead of adding it to the existing samplers module. Doing that allows the existing stable java instrumentation agent to continue depending on the existing samplers module, without introducing new dependencies or opening up any potential expression exploit/vulnerability that might be dreamed up.

Based on this, I'm going to request changes on this PR in hopes that you can either adjust this one or make a new one that creates a new gradle module. I hope you understand and can work with us to get this great work merged.

Thanks again for the contribution and looking forward to seeing it land as a new module here.

@dol
Copy link
Contributor Author

dol commented Oct 4, 2025

@breedx-splk Thanks for the clarification on how to go forward. I'll go ahead and perform the requested changes to outsource the CEL sampler to it's own gradle module.

I was not aware that the OTEL collector already has a similar approeach in complex filtering with it's own DSL, the OpenTelemetry Transformation Language (OTTL).
I agree that I would make sense to have the one common syntax like the OTTL for processing and sampling.

Adds new CelBasedSampler to the OpenTelemetry Java SDK, enabling advanced
sampling rules using the Common Expression Language (CEL).
@dol dol force-pushed the feature/cel-sampler branch from c930856 to 55c88a8 Compare October 9, 2025 21:52
@dol
Copy link
Contributor Author

dol commented Oct 9, 2025

@breedx-splk It's now it's own dedicated gradle module. I messed up the rebase and it included more commits then expected. As a consequence the github-action bot included a lot more reviewers. 🙇 Sorry for spamming additional code owners.

Copy link
Contributor

@breedx-splk breedx-splk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nearly there! I had a few small comments, and I do think it's helpful to expand the component owners list. Please also add this project to the topmost README.md in the repo.

Thanks again!

@breedx-splk breedx-splk dismissed their stale review October 15, 2025 16:54

Already doing follow-up reviews. Getting close to approval!

- Added @trask, @jack-berg and @breedx-splk as additional code owners
- Added the project to the topmost README.md list of Provided Libraries
- Made the list of expressions passed to the sampler immutable
- Calculate the string representation of an expression on demand and not
  set it as a class private property
@dol dol force-pushed the feature/cel-sampler branch from a5748f3 to c863fe6 Compare October 16, 2025 20:24
@breedx-splk breedx-splk added this pull request to the merge queue Oct 16, 2025
Merged via the queue into open-telemetry:main with commit eed644f Oct 16, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants