Skip to content

Conversation

@ryanhoangt
Copy link
Collaborator

@ryanhoangt ryanhoangt commented Nov 3, 2025

Fix OpenHands/OpenHands#11491


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Base Image Docs / Tags
golang golang:1.21-bookworm Link
java eclipse-temurin:17-jdk Link
python nikolaik/python-nodejs:python3.12-nodejs22 Link

Pull (multi-arch manifest)

docker pull ghcr.io/openhands/agent-server:b934564-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-b934564-python \
  ghcr.io/openhands/agent-server:b934564-python

All tags pushed for this build

ghcr.io/openhands/agent-server:b934564-golang
ghcr.io/openhands/agent-server:v1.0.0a5_golang_tag_1.21-bookworm_binary
ghcr.io/openhands/agent-server:b934564-java
ghcr.io/openhands/agent-server:v1.0.0a5_eclipse-temurin_tag_17-jdk_binary
ghcr.io/openhands/agent-server:b934564-python
ghcr.io/openhands/agent-server:v1.0.0a5_nikolaik_s_python-nodejs_tag_python3.12-nodejs22_binary

The b934564 tag is a multi-arch manifest (amd64/arm64); your client pulls the right arch automatically.

@ryanhoangt ryanhoangt requested a review from xingyaoww November 3, 2025 10:50
@github-actions
Copy link
Contributor

github-actions bot commented Nov 3, 2025

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/llm/mixins
   fn_call_converter.py4303899%74–76, 365–380, 382–383, 385, 394–395, 397–400, 402–403, 405–406, 408–409, 411–414, 416–417, 419–420, 422–423, 428, 435, 437, 458–465, 467–471, 475–481, 483–488, 495–496, 498–501, 503–518, 525–526, 528–529, 532–539, 542–546, 548–550, 553, 556–566, 568–569, 571–580, 584–587, 591–596, 598, 602–610, 612–615, 617–620, 622, 624–626, 628, 632–634, 636, 638–639, 648, 650–651, 655–659, 663–668, 670, 672, 677, 680, 682–683, 686, 689, 691–695, 701, 723, 728, 738–741, 747–750, 755–761, 763, 765, 769, 772, 774–775, 777–780, 785, 787, 791–792, 796, 802, 804, 807, 815, 817–819, 821–823, 825–827, 833–836, 839–840, 847–852, 855–859, 864, 867, 871, 875, 880–881, 884–886, 889, 894–896, 898–899, 915, 927–929, 933–934, 936–939, 941–942, 944–946, 948, 951, 953, 955–957, 959–960, 963–964, 967–969, 971–974, 977, 981, 988–989, 992–993, 1007, 1013–1015, 1018–1019, 1023–1024, 1030–1031, 1034, 1046, 1049–1056, 1060–1061, 1066–1067, 1072, 1078–1081, 1091–1092, 1097, 1103–1104, 1109–1110, 1117, 1120–1122, 1125–1126, 1128, 1134, 1139, 1142, 1146, 1154, 1156–1157, 1160–1162, 1164–1165, 1171–1173, 1175–1176, 1178, 1180, 1184, 1186, 1191, 1193–1194, 1197
TOTAL11180506754% 

Copy link
Collaborator

@xingyaoww xingyaoww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you try to migrate stuff from this PR? OpenHands/OpenHands#11497

@ryanhoangt
Copy link
Collaborator Author

Ah I didn't see that PR, I've ported the ICL examples into this PR, the schema formatting of this PR is more comprehensive so I keep it as-is. Tested with qwen/qwen3-235b-a22b-2507 with native_tool_calling=False and it seems to work correctly now!

Screenshot 2025-11-03 at 21 45 51

@ryanhoangt ryanhoangt requested a review from xingyaoww November 3, 2025 14:52
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OpenHands do a /codereview-roasted

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on it! xingyaoww can track my progress at all-hands.dev

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

I've completed a roasted code review in the style of Linus Torvalds for PR #1001: "Handle nested field schema formatting for tool description".

What I Did

I analyzed the PR that fixes issue #11491, where the task_tracker tool was producing "Untitled task" entries because LLMs didn't understand the nested schema structure of the task_list parameter.

Code Review Findings

🔴 Needs improvement - The PR solves a real problem but has implementation issues:

CRITICAL BUG FOUND:

  • Lines 344-356: The task_tracker example JSON contains trailing commas, which is invalid JSON per RFC 8259. This will cause LLMs to learn incorrect JSON syntax, leading to parsing failures and the same "Untitled task" problem the PR is trying to fix.

Major Complexity Issues:

  1. _format_schema_detail function (lines 521-598): 78 lines doing too much - handling unions, arrays, objects, and additional properties all in one function with up to 3 levels of nesting
  2. Magic numbers everywhere: Hardcoded indent calculations (indent + 2, indent + 4, indent=6) scattered throughout with no clear constants
  3. Inconsistent description handling: Union options omit description suffix when missing, but object properties show "No description provided"

What's Actually Good:

  • ✅ Solves a real, user-facing problem
  • ✅ Tests are comprehensive and well-structured
  • ✅ Core recursive algorithm is fundamentally sound
  • ✅ Pragmatic solution with in-context learning examples

Verdict

Needs rework - Must fix the trailing comma bug in the task_tracker example JSON first, then consider refactoring for clarity.

After fixes: The code is acceptable and addresses the real problem of LLMs not understanding nested schemas, but the implementation has unnecessary complexity that should be cleaned up.

No code changes were made - this was a review-only task as requested by the /codereview-roasted command.

View full conversation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryanhoangt could you address this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: convert_tools_to_description does not parse nested parameters of tools

3 participants