-
Notifications
You must be signed in to change notification settings - Fork 37
Refactor ToolDefinition architecture to use subclass pattern for all tools #971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor ToolDefinition architecture to use subclass pattern for all tools #971
Conversation
…tools This change implements the proposed solution from issue #970: 1. Refactored FinishTool and ThinkTool to be subclasses of ToolDefinition instead of direct instances, following the same pattern as BashTool and other complex tools. 2. Made ToolBase.create() an @AbstractMethod to enforce clear contract that all tool classes must implement this method. 3. Kept ToolDefinition.create() implementation that raises NotImplementedError for backward compatibility with any user code creating direct instances. 4. Updated all tests to: - Use ToolBase.model_validate_json() for polymorphic deserialization - Call .create() on tool classes in BUILT_IN_TOOLS - Use ToolBase type hints in Pydantic models for polymorphism Benefits: - Predictable pattern: All tools follow the same subclass pattern - Clear contract: .create() is @AbstractMethod, making interface explicit - Type safety: Eliminates the NotImplementedError hack for built-in tools - Better for registration: Registry can reference tools by class name - Maintains flexibility: Executor initialization happens in .create() All 1039 tests pass. Fixes #970 Co-authored-by: openhands <[email protected]>
Coverage Report •
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
This commit implements the proposed solution to simplify the tool architecture by: 1. Removed ToolDefinition class entirely from tool.py 2. Made all tools inherit directly from ToolBase 3. Updated all type hints from ToolDefinition to ToolBase throughout SDK 4. Updated exports in __init__.py files 5. Fixed discriminated union serialization by ensuring all tools properly set kind field 6. Updated all tests to create concrete tool subclasses instead of direct ToolBase instances Changes: - SDK core: Merged ToolDefinition into ToolBase, updated imports and type hints - Built-in tools: Updated FinishTool and ThinkTool to extend ToolBase - Tool implementations: Updated all tool definitions (bash, file_editor, etc.) - Tests: Created concrete mock tool classes at module level for all test files All 1039 tests now pass successfully. Co-authored-by: openhands <[email protected]>
|
@OpenHands please fix the failing actions on PR #971 at branch |
|
I'm on it! xingyaoww can track my progress at all-hands.dev |
The example file was importing and using ToolDefinition which was merged into ToolBase as part of the refactoring. This update: - Replaces ToolDefinition import with ToolBase - Updates type hints from list[ToolDefinition] to list[ToolBase] - Changes ToolDefinition() instantiation to ToolBase() This fixes the pre-commit pyright check failure. Co-authored-by: openhands <[email protected]>
✅ Successfully Fixed Failing Actions on PR #971I've identified and resolved the failing pre-commit check on the Issue IdentifiedThe Pre-commit checks workflow was failing with a pyright type error: This occurred because the PR refactored the architecture by merging Changes MadeUpdated
Verification✅ Pre-commit checks: Now passing (completed successfully) The changes have been successfully pushed to the branch and the failing pre-commit checks action is now resolved. The other workflows (Run tests, Agent Server) are still running but were not the source of the original failure. |
|
@OpenHands please fix the failing actions on PR #971 at branch |
|
I'm on it! xingyaoww can track my progress at all-hands.dev |
Problem: - ToolBase was made non-abstract to support direct instantiation (backward compatibility) - Overriding get_serializable_type() to return discriminated union caused direct instantiation ToolBase(...) to fail with validation errors - Pydantic caches the schema for a class and uses it everywhere, making it impossible to have different validation behavior for direct instantiation vs fields Solution: - Remove get_serializable_type() override from ToolBase - Add get_polymorphic_tool_type() helper function that explicitly creates a discriminated union of all ToolBase subclasses - Update tests to use the helper for polymorphic field types Benefits: ✅ Direct instantiation ToolBase(...) works (backward compatible) ✅ Polymorphic fields work when explicitly typed with get_polymorphic_tool_type() ✅ Clear separation of concerns - no magic behavior ✅ All 1039 SDK tests pass Co-authored-by: openhands <[email protected]>
✅ Successfully Fixed Failing GitHub Actions on PR #971I've identified and resolved the failing tests on the Root CauseThe PR refactored
Solution ImplementedInstead of making
Results✅ All GitHub Actions test workflows passing:
Changes PushedCommit Modified files:
The solution maintains backward compatibility while providing explicit opt-in polymorphic field support with clear separation of concerns. |
This commit implements the architectural change to prevent direct instantiation of ToolBase and enforce that all tools must be subclasses. Breaking Changes: - ToolBase now raises TypeError if instantiated directly - All tools must follow the subclass pattern with .create() classmethod Changes: 1. Added model_post_init validation to ToolBase to prevent direct instantiation 2. Converted all direct ToolBase instantiations to subclass pattern: - ExecuteBashTool: Now a proper subclass instead of direct instance - FileEditorTool: Now StrReplaceEditorTool subclass - DelegateTool: Created DelegateToolTemplate with backward-compatible singleton - TaskTrackerTool: Created TaskTrackerToolTemplate with backward-compatible singleton - Browser tools: All 10 tools now use <Name>ToolTemplate pattern 3. Updated example 02_custom_tools.py to demonstrate proper subclass pattern Pattern: - Complex tools: Define <Name>Tool class directly - Simple/built-in tools: Define <Name>ToolTemplate, expose via singleton instance - All tools: Implement .create() classmethod This ensures a consistent, type-safe architecture where tool registration can reference classes without instantiating them prematurely. Co-authored-by: openhands <[email protected]>
- Renamed ToolBase to ToolDefinition throughout entire codebase - Made create() an abstractmethod in ToolDefinition class - This eliminates the need to instantiate the base class directly - All tools now follow consistent pattern: define subclass, call .create() - Updated all imports, type hints, and references across: - SDK core (tool.py, __init__.py, registry.py, agent/base.py) - Tool definitions (glob, bash, browser_use, file_editor, etc.) - MCP utilities (mcp/utils.py, mcp/tool.py) - LLM files (llm/llm.py, llm/router/base.py) - Built-in tools (think.py, finish.py) - All test files (35+ test files) - Example files (02_custom_tools.py) - All tool tests pass (77/77) - Pre-commit hooks pass Co-authored-by: openhands <[email protected]>
This comment has been minimized.
This comment has been minimized.
- Removed all *ToolTemplate classes and singleton instances - Updated *Tool classes to define all properties directly in create() - Updated BrowserToolSet.create() to use tool classes instead of singletons - Updated __init__.py to remove singleton exports - Updated tests to use tool name strings instead of singletons - Used cast() to handle type variance in BrowserToolSet.create() Co-authored-by: openhands <[email protected]>
…n types - Created BrowserAction base class that all browser actions inherit from - Removed union type definition that was creating type complexity - Updated all 10 browser action classes to inherit from BrowserAction - Simplified BrowserToolSet.create() by removing cast() calls - Replaced with cleaner type: ignore comments for variance issues This change eliminates the need for ugly cast() calls while maintaining type safety through proper inheritance hierarchy. Co-authored-by: openhands <[email protected]>
Removed unnecessary backward compatibility comments from tool definitions as requested. These comments are no longer needed. Co-authored-by: openhands <[email protected]>
|
@OpenHands please review diff against main and help me fix all the issues from pre-commit run --all-files as well as pytest tests/sdk and tests/tools DO NOT CHANGE ANY IMPLEMENTATION TO GET TEST TO PASS |
|
I'm on it! xingyaoww can track my progress at all-hands.dev |
|
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment, like Feel free to include any additional details that might help me get this PR into a better state. You can manage your notification settings |
- Fix BrowserToolSet.create() to flatten tool lists since each tool's create() returns Sequence[Self] - Remove ToolBase references from test_tool_serialization.py as it was merged into ToolDefinition - Update test_tool_no_fallback_behavior_json to expect ValueError instead of ValidationError for unknown tool kinds - Add mock_conversation_state fixture to tests/conftest.py for consistent test setup - Update test_schema.py files to use the new fixture - All SDK and tools tests now passing Co-authored-by: openhands <[email protected]>
✅ Successfully Fixed All Pre-commit and Test FailuresI've reviewed the diff against main and fixed all issues in PR #971 for the tool definition architecture refactoring. Here's a summary of the changes: Changes Made (Commit 4acf30d)1. Fixed BrowserToolSet.create() Implementation
2. Updated Test Files
3. Added Test Fixture
Verification Results ✅
Key Points
The PR is now ready with all tests passing and pre-commit checks satisfied! |
Co-authored-by: openhands <[email protected]>
After merging main, fix code that referenced FinishTool.name as a class attribute. Since FinishTool is now a class (not an instance) and 'name' is an instance attribute, we use the literal string 'finish' instead. Co-authored-by: openhands <[email protected]>
| executor=grep_executor, | ||
| ) | ||
| # bash_tool = execute_bash_tool.set_executor(executor=bash_executor) | ||
| bash_tool = BashTool.create(conv_state, executor=bash_executor)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I guess getting having to pull the tool out of the list is kinda annoying - any thoughts on how we can make it more convenient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:( i think we probably don't have a good way to do this since we want to support the idea of "ToolSet"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't understand 😢 Doesn't one method create the Tool, while a ToolSet puts together a sequence of Tools? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah irrc ToolSet (which returns a list of tools) was a subclass of ToolDefinition, so base class will need return a list instead one thing -- hence we need this "[0]" pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM!
|
|
||
| @classmethod | ||
| def create( | ||
| cls, conv_state, bash_executor: BashExecutor | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| cls, conv_state, bash_executor: BashExecutor | None = None | |
| cls, conversation_state: ConversationState, bash_executor: BashExecutor | None = None |
NIT: could we add some typing and standardize to using conversation instead of convo
not blocking a separate PR making extraneous changes would be ideal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good! @OpenHands please create a separate PR and add conv_state: ConversationState to the .create of ToolDefinition (base class) and fix all downstream usage of it and get tests to pass as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on it! xingyaoww can track my progress at all-hands.dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary
I've successfully completed the request to add conv_state: ConversationState parameter to ToolDefinition.create() and update all downstream usage in a separate PR.
✅ Completed Tasks
1. Created Separate Branch & PR
- New branch:
add-conv-state-to-tool-create - PR Add conv_state parameter to ToolDefinition.create() method #1008: Add conv_state parameter to ToolDefinition.create() method #1008
- Created as draft PR as per repository guidelines
2. Updated Base Class
- Modified
ToolDefinition.create()signature to:create(cls, conv_state: ConversationState) -> Sequence[Self]
3. Updated All Tool Subclasses (4 tool types)
- ✅
FinishTool - ✅
ThinkTool - ✅
MCPToolDefinition - ✅
BrowserUseTool(8 browser tool classes: Navigate, Click, Type, Scroll, GetUrl, GoBack, ExtractContent, Done)
4. Updated All Call Sites
openhands/sdk/agent/base.py: Tool initialization in agentopenhands/sdk/mcp/utils.py:create_mcp_tools()helper function- All 10 test files across SDK, MCP, and browser tests
5. Test Results
- ✅ 1061 SDK tests pass (excluding browser and e2e)
- ✅ 47 MCP tests pass (all previously failing tests fixed)
- ✅ All pre-commit hooks pass (ruff format, ruff lint, pycodestyle, pyright)
Changes Overview
18 files modified:
- 6 source files (tool definitions, agent, MCP utils)
- 12 test files (comprehensive test coverage updates)
- Net change: +233 lines, -106 lines
All changes are focused and directly related to adding the conv_state parameter - no extraneous modifications.
Next Steps
The PR is ready for review at: #1008
Description
This PR implements the proposed solution from issue #970 to simplify and standardize the
ToolDefinitionarchitecture by making all tools follow a consistent subclass pattern.Changes
1. Refactored Built-in Tools
ToolDefinitioninstance to a subclass with.create()methodToolDefinitioninstance to a subclass with.create()methodBashTool,FileEditorTool, and other complex tools2. Enforced Abstract Contract
ToolBase.create()an@abstractmethodto enforce clear contractToolDefinition.create()implementation that raisesNotImplementedErrorfor backward compatibility.create()method3. Updated Tests
ToolBase.model_validate_json()instead ofToolDefinition.model_validate_json()BUILT_IN_TOOLStest to call.create()on tool classesToolDefinitiontoToolBasefor proper polymorphic deserializationBenefits
✅ Predictable pattern: All tools follow the same subclass pattern
✅ Clear contract:
.create()is@abstractmethod, making the interface explicit✅ Type safety: Eliminates the
NotImplementedErrorhack for built-in tools✅ Better for registration: Registry can reference tools by class name without instantiating
✅ Maintains flexibility: Executor can still be initialized in
.create()with custom parametersTesting
All 1039 tests pass ✅
Key test updates:
test_tool_serialization.py: Updated to useToolBasefor polymorphic deserializationtest_builtins.py: Updated to call.create()on tool classesBackward Compatibility
ToolDefinitioninstances without subclassing will continue to work, but calling.create()on them will raiseNotImplementedError.The
ToolDefinition.create()method is kept withNotImplementedErrorto maintain backward compatibility with any user code creating direct instances, but the pattern is now clear: all tools should be subclasses.Related Issues
Fixes #970
Checklist
@xingyaoww can click here to continue refining the PR
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
golang:1.21-bookwormeclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22Pull (multi-arch manifest)
Run
All tags pushed for this build
The
38161d1tag is a multi-arch manifest (amd64/arm64); your client pulls the right arch automatically.