Fix github issue 2441 selected call #2451

aaronvg · 2025-09-13T23:43:20Z

Pull Request Template

Thanks for taking the time to fill out this pull request!

Issue Reference

Please link to any related issues

This PR fixes #2441: "[bug] Collector: when using fallback strategy with 2 clients, and if one of the calls fails, the selected_call will be populated with the failed call."

Changes

Please describe the changes proposed in this pull request

This PR addresses an issue where selected_call could incorrectly mark a failed LLM response as selected. The logic for determining the selected call in engine/baml-runtime/src/tracingv2/storage/storage.rs now explicitly checks that the LLM response does not contain an error_message. A new unit test has been added to ensure that if multiple calls occur, a successful call is correctly prioritized as the selected_call over a failed one.
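As a rough illustration of the check described above, the sketch below treats a response as eligible for selected_call only when it has no error_message. The `LlmResponse` struct, the `is_selectable` and `pick_selected` helpers, and the choice of taking the first eligible response are assumptions made for this example; they are not the actual types or logic in storage.rs.

```rust
/// Hypothetical stand-in for an LLM response record.
/// The real type in storage.rs is not shown in this PR description.
#[derive(Debug, Clone)]
struct LlmResponse {
    client: String,
    error_message: Option<String>,
}

/// A response is eligible to be the selected call only if it carries no error.
fn is_selectable(resp: &LlmResponse) -> bool {
    resp.error_message.is_none()
}

/// Returns the index of the first response without an error, if any.
/// Picking the first eligible response is an assumption of this sketch.
fn pick_selected(responses: &[LlmResponse]) -> Option<usize> {
    responses.iter().position(is_selectable)
}
```

With a failed primary call followed by a successful fallback call, `pick_selected` returns the index of the fallback, which is the behavior the new unit test is described as checking.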

Testing

Please describe how you tested these changes

Unit tests added/updated (See test_selected_call_prefers_success_over_failure)
Manual testing performed
Tested in [environment]

Screenshots

If applicable, add screenshots to help explain your changes

PR Checklist

Please ensure you've completed these items

I have read and followed the contributing guidelines
My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings

Additional Notes

Add any other context about the PR here

Tests were not executed in the environment due to missing cargo.
To run tests locally:

cargo test -p baml-runtime -q

Slack Thread

Co-authored-by: aaron <[email protected]>

cursor · 2025-09-13T23:43:20Z

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.

vercel · 2025-09-13T23:43:25Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Preview	Comments	Updated (UTC)
promptfiddle	Skipped			Sep 13, 2025 11:52pm

github-actions · 2025-09-13T23:45:13Z

🌿 Preview your docs: https://boundary-preview-196934ba-c8b2-4a28-bf01-3be0cd104a2d.docs.buildwithfern.com

github-actions · 2025-09-13T23:45:46Z

https://sage-backend-nm31bcerx-baml.vercel.app

aaronvg · 2025-09-13T23:51:25Z

@cursoragent you should choose the last call in the stack using the timestamp, unless it failed then try to look for the one that passed. The http_request_id is ULID with a timestamp. see define_id!(..)

cursor · 2025-09-13T23:51:31Z

I refined the LLM call selection logic in build_function_log to ensure only one call is marked 'selected' per function log, prioritizing the latest successful call for improved trace clarity.

I modified build_function_log in engine/baml-runtime/src/tracingv2/storage/storage.rs. I now collect all LLM call candidates, then apply a new algorithm: select the latest call if it succeeded; if the latest call failed, fall back to the most recent successful call.
I added test_selected_call_chooses_earlier_success_if_last_failed to validate this new selection behavior.
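The selection rule described in this comment can be sketched as follows. This is a minimal illustration, not the code in build_function_log: the `Candidate` struct and `select_latest_success` function are hypothetical, and request ids are modeled as plain strings on the assumption that ULID-style ids sort lexicographically in creation order.

```rust
/// Hypothetical candidate record; field names are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    /// ULID-like id, assumed to sort lexicographically by creation time.
    request_id: String,
    is_success: bool,
}

/// Walks the candidates from latest to earliest and returns the first
/// successful one. If the latest call succeeded it is returned directly;
/// otherwise the most recent earlier success is returned.
fn select_latest_success(candidates: &[Candidate]) -> Option<&Candidate> {
    let mut sorted: Vec<&Candidate> = candidates.iter().collect();
    // Order by id so the last element is the most recent call.
    sorted.sort_by(|a, b| a.request_id.cmp(&b.request_id));
    sorted.into_iter().rev().find(|c| c.is_success)
}
```

If every call failed, the function returns `None`, so no call would be marked as selected in this sketch.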


Co-authored-by: aaron <[email protected]>

github-actions · 2025-09-13T23:54:17Z

🌿 Preview your docs: https://boundary-preview-b5d000ae-3f8e-4b64-92bb-f6d93ed6254c.docs.buildwithfern.com

github-actions · 2025-09-14T00:37:22Z

https://sage-backend-fx1y7rjar-baml.vercel.app

…order (#2692)

### Issue Reference

This PR fixes #2451

### Summary

Updated LLM call selection logic to correctly pick the earliest successful call by lexicographic `HttpRequestId` ordering. This ensures deterministic selection when multiple successful calls exist.

### Changes

This PR addresses an issue where the code prioritized completion time instead of lexicographically earliest ulid uuid.

1. Filter to keep only candidates where `is_success = true`
2. Sort these successful candidates by http request id (ulid uuid, lexicographically)
3. Choose the first candidate from this sorted list

### Details

- Filters candidates to only successful calls
- Sorts by `request_id.to_string()` lexicographically (since `HttpRequestId` lacks `Ord`)
- Picks the first entry in the sorted list
- Verified via `cargo test -p baml-runtime -q`

---

> [!NOTE]
> Refactors LLM call selection to pick the earliest successful call by lexicographic HttpRequestId and marks only that call as selected, with new tests covering success-vs-failure and ordering.
>
> - **Tracing/FunctionLog (storage.rs)**
>   - Refactor LLM call assembly: introduce `CallCandidate` to gather data for each `request_id`.
>   - Selection logic: filter successful candidates, sort by `request_id.to_string()` lexicographically, select the first; mark only that call as `selected` across `Basic` and `Stream` kinds.
> - **Tests**
>   - Add `test_selected_call_prefers_success_over_failure` and `test_selected_call_chooses_earlier_success_if_last_failed` to verify selection behavior.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 9a3143e. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>

---------

Co-authored-by: Cursor Agent <[email protected]>
Co-authored-by: aaron <[email protected]>
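The three steps listed in the commit message (filter, sort, take the first) might look roughly like the sketch below. The `HttpRequestId` and `CallCandidate` definitions here are simplified stand-ins written for this example, not the real types in baml-runtime; the only details taken from the message are the `is_success` flag, the missing `Ord` implementation, and sorting on `request_id.to_string()`.

```rust
use std::fmt;

/// Simplified stand-in for the real id type. Like the type described in
/// the commit message, it deliberately does not implement `Ord`.
#[derive(Debug, Clone)]
struct HttpRequestId(String);

impl fmt::Display for HttpRequestId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Hypothetical per-request record; the real `CallCandidate` likely
/// carries more data than this sketch.
#[derive(Debug, Clone)]
struct CallCandidate {
    request_id: HttpRequestId,
    is_success: bool,
}

/// Filter to successful candidates, sort by the string form of the id,
/// and return the first (lexicographically earliest) one.
fn select_earliest_success(candidates: &[CallCandidate]) -> Option<&CallCandidate> {
    let mut successful: Vec<&CallCandidate> =
        candidates.iter().filter(|c| c.is_success).collect();
    // Sorting on `to_string()` works around the missing `Ord` impl.
    successful.sort_by_key(|c| c.request_id.to_string());
    successful.into_iter().next()
}
```

Because the sort key depends only on the id, the result is the same regardless of the order in which candidates were collected, which is the determinism the summary refers to.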

Fix: Correctly mark successful LLM calls as selected

caa7804

Co-authored-by: aaron <[email protected]>

aaronvg temporarily deployed to boundary-tools-dev September 13, 2025 23:43 — with GitHub Actions Inactive

Refactor: Improve LLM call selection logic and add test

fb0b91c

Co-authored-by: aaron <[email protected]>

cursor bot temporarily deployed to boundary-tools-dev September 13, 2025 23:52 Inactive

vercel bot temporarily deployed to Preview September 13, 2025 23:52 Inactive

aaronvg marked this pull request as ready for review September 13, 2025 23:56

shawn-mcdonald-dev mentioned this pull request Nov 3, 2025

fix: select earliest successful LLM call by lexicographic request_id order #2692

Merged

aaronvg closed this in #2692 Nov 5, 2025
