Merged
90 commits
13609c2
refactor: standardize Observation base class with error/output and st…
openhands-agent Oct 27, 2025
6520808
refactor: standardize Observation subclasses and maintain backward-co…
openhands-agent Oct 27, 2025
a349a7b
test: align tests with standardized Observation fields
openhands-agent Oct 27, 2025
3ff4cb9
test(execute_bash): update assertions to use has_error per standardiz…
openhands-agent Oct 27, 2025
4c1d809
refactor(delegate): standardize DelegateObservation to use output and…
openhands-agent Oct 27, 2025
7ce7a9b
Merge branch 'main' into openhands/standardize-observation-base
simonrosenberg Oct 28, 2025
d0ca50a
refactor: improve observation consistency
openhands-agent Oct 28, 2025
b7efb0d
refactor: simplify MCPToolObservation to use base to_llm_content
openhands-agent Oct 28, 2025
a93a9e2
refactor: use error field for multiple command error in ExecuteBash
openhands-agent Oct 28, 2025
bded49d
refactor: remove redundant to_llm_content from FileEditorObservation
openhands-agent Oct 28, 2025
262e2a5
refactor: revert TaskTrackerObservation changes to preserve original …
openhands-agent Oct 28, 2025
f0aaea0
refactor: add optional command field to base Observation and update t…
openhands-agent Oct 28, 2025
11d5495
refactor: clean up observation subclasses for consistency
openhands-agent Oct 28, 2025
b52ce10
update
simonrosenberg Oct 28, 2025
2c218cf
fix: populate output field in ThinkObservation and FinishObservation
openhands-agent Oct 28, 2025
b4a29fc
refactor: improve Observation consistency and error handling
openhands-agent Oct 28, 2025
5c422be
refactor: simplify MCPToolObservation to use base output field
openhands-agent Oct 28, 2025
c2cb27b
Merge branch 'main' into openhands/standardize-observation-base
simonrosenberg Oct 30, 2025
adc5da0
Merge branch 'main' into openhands/standardize-observation-base
simonrosenberg Oct 31, 2025
09cee6b
update tool base schema
simonrosenberg Oct 31, 2025
157140c
refactor: update all Observation subclasses to use standardized base …
openhands-agent Oct 31, 2025
651f957
fix: resolve type errors in examples, planning_file_editor, and test_…
openhands-agent Oct 31, 2025
2412728
refactor: rename _format_error to format_error and use it consistently
openhands-agent Oct 31, 2025
da0e0be
refactor: preserve truncation in ExecuteBashObservation.to_llm_content
openhands-agent Oct 31, 2025
e4f3efd
refactor: apply metadata prefix/suffix to error output
openhands-agent Oct 31, 2025
353b3e9
update doc
simonrosenberg Oct 31, 2025
85773de
refactor: simplify ExecuteBashObservation to use only output field fr…
openhands-agent Oct 31, 2025
01479d7
fix: add raw_output property to ExecuteBashObservation and update tests
openhands-agent Oct 31, 2025
6285b85
refactor: remove command property from base Observation class
openhands-agent Oct 31, 2025
6470d0a
fix: update planning_file_editor to use command instead of cmd
openhands-agent Oct 31, 2025
2d12008
fix: use command instead of cmd in bash reset with command
openhands-agent Oct 31, 2025
0b4b706
Merge branch 'main' into openhands/standardize-observation-base
simonrosenberg Oct 31, 2025
b252b15
Add ergonomic helpers for standardized 'output' field
openhands-agent Nov 1, 2025
21d2d56
Revert "Add ergonomic helpers for standardized 'output' field"
simonrosenberg Nov 1, 2025
9b1868a
feat: support str and list types for Observation.output
openhands-agent Nov 2, 2025
c474405
refactor: update remaining tools to use str output instead of list
openhands-agent Nov 2, 2025
d2dba7a
Merge branch 'main' into openhands/standardize-observation-base
simonrosenberg Nov 3, 2025
23bb43d
fix: update delegation tests to work with standardized Observation base
openhands-agent Nov 3, 2025
5df3c52
Merge branch 'main' into openhands/standardize-observation-base
openhands-agent Nov 4, 2025
12b4d19
Merge branch 'main' into openhands/standardize-observation-base
xingyaoww Nov 4, 2025
b71b350
Merge branch 'main' into openhands/standardize-observation-base
simonrosenberg Nov 4, 2025
0e712a9
refactor: update Observation base class to use 'content' and 'is_erro…
openhands-agent Nov 4, 2025
1ffcf6b
test: update all tests to use new observation schema with content and…
openhands-agent Nov 4, 2025
0b67da3
test: fix additional test failures after observation schema refactor
openhands-agent Nov 4, 2025
d609bf1
test: fix file_editor test helper to use content instead of output
openhands-agent Nov 4, 2025
2351225
test: update execute_bash and file_editor tests to use new observatio…
openhands-agent Nov 4, 2025
b6ad774
test: update browser_use and stuck_detector tests to use new observat…
openhands-agent Nov 4, 2025
dfc918f
refactor: simplify MCPToolObservation creation with single return path
openhands-agent Nov 4, 2025
38841be
refactor: simplify MCPToolObservation.visualize using assert for type…
openhands-agent Nov 4, 2025
7b5ff64
feat: add configurable error_message_header to Observation base class
openhands-agent Nov 4, 2025
446c53a
Merge branch 'main' into openhands/standardize-observation-base
openhands-agent Nov 4, 2025
e17c515
fix: resolve merge conflicts and simplify observation implementations
openhands-agent Nov 4, 2025
f224f2a
Merge main into openhands/standardize-observation-base
openhands-agent Nov 4, 2025
a4cfa37
fix: handle both str and list types in ExecuteBashObservation content
openhands-agent Nov 4, 2025
d63ed09
fix: remove incorrect list content in ExecuteBashObservation test cases
openhands-agent Nov 4, 2025
ccda8ea
fix: correct BrowserObservation content from list to str in tests
openhands-agent Nov 4, 2025
c0750b0
fix: update MCPToolObservation error handling to use list content and…
openhands-agent Nov 4, 2025
0cf66ad
refactor: change Observation.content to list[TextContent | ImageConte…
openhands-agent Nov 4, 2025
2645aab
fix: update tests and examples to use list-based observation.content
openhands-agent Nov 5, 2025
249f9fe
fix: update task_tracker and tests to use list-based observation.content
openhands-agent Nov 5, 2025
18fced3
Merge branch 'openhands/standardize-observation-base' of github.com:O…
simonrosenberg Nov 5, 2025
851213b
refactor: simplify Observation API by replacing get_text_safe with ge…
openhands-agent Nov 5, 2025
45bff22
content_obj -> content
simonrosenberg Nov 5, 2025
5b4cffd
refactor: standardize visualize method error handling across observat…
openhands-agent Nov 5, 2025
5ba09ad
refactor: apply standardized visualize pattern to MCPToolObservation
openhands-agent Nov 5, 2025
b7ca7a2
improve readibility
simonrosenberg Nov 5, 2025
98cce06
update doc
simonrosenberg Nov 5, 2025
8d65a9b
update description
simonrosenberg Nov 5, 2025
f24ab0a
refactor: use from_text for ExecuteBashObservation initialization
openhands-agent Nov 5, 2025
49e8832
fix: add required command parameter to ExecuteBashObservation tests
openhands-agent Nov 5, 2025
fd18203
add is_error = True
simonrosenberg Nov 5, 2025
84da675
refactor: simplify content extraction in execute_bash impl
openhands-agent Nov 5, 2025
19e5a7b
refactor: use get_text() method for cleaner content extraction
openhands-agent Nov 5, 2025
8c3753c
fix: correctly set content field as list of TextContent in ExecuteBas…
openhands-agent Nov 5, 2025
e05c35d
refactor: make ERROR_MESSAGE_HEADER a class variable and rename get_t…
openhands-agent Nov 5, 2025
cd8807b
Merge branch 'main' into openhands/standardize-observation-base
xingyaoww Nov 5, 2025
e62771b
fix: update test to handle observation.content as list
openhands-agent Nov 5, 2025
38256bf
Apply suggestion from @xingyaoww
xingyaoww Nov 5, 2025
85b1f15
Apply suggestion from @xingyaoww
xingyaoww Nov 5, 2025
2008c4b
Merge branch 'main' into openhands/standardize-observation-base
simonrosenberg Nov 5, 2025
a19e454
Revert "Apply suggestion from @xingyaoww"
xingyaoww Nov 5, 2025
4a564b6
fix error msg
simonrosenberg Nov 5, 2025
4293eb1
use ERROR_MESSAGE_HEADER
simonrosenberg Nov 5, 2025
d1684ec
Merge branch 'main' into openhands/standardize-observation-base
xingyaoww Nov 5, 2025
2fab1dd
simplify
xingyaoww Nov 5, 2025
1d16b6f
fix test
xingyaoww Nov 5, 2025
2bf41e2
simplify
xingyaoww Nov 5, 2025
ece030e
simplify test
xingyaoww Nov 5, 2025
c648e23
clean up get_output_text
xingyaoww Nov 5, 2025
1d0e7f4
fix test
xingyaoww Nov 5, 2025
6 changes: 4 additions & 2 deletions examples/01_standalone_sdk/02_custom_tools.py
@@ -92,8 +92,10 @@ def __call__(self, action: GrepAction, conversation=None) -> GrepObservation: #
files: set[str] = set()

# grep returns exit code 1 when no matches; treat as empty
if result.output.strip():
for line in result.output.strip().splitlines():
output_text = result.text

if output_text.strip():
for line in output_text.strip().splitlines():
matches.append(line)
# Expect "path:line:content" — take the file part before first ":"
file_path = line.split(":", 1)[0]
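The loop above reads grep-style lines from `result.text` and keeps the part before the first colon as the file path. A minimal standalone sketch of that parsing follows; the helper name `parse_grep_output` and the sample output are hypothetical and not part of the SDK.

```python
def parse_grep_output(output_text: str) -> tuple[list[str], set[str]]:
    """Split grep-style output into matching lines and the set of files hit.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    matches: list[str] = []
    files: set[str] = set()
    stripped = output_text.strip()
    if stripped:
        for line in stripped.splitlines():
            matches.append(line)
            # Expect "path:line:content"; the file is everything before the first ":"
            files.add(line.split(":", 1)[0])
    return matches, files


# Made-up sample output with two files and three matching lines
sample = "src/a.py:3:import os\nsrc/b.py:10:import os\nsrc/a.py:7:os.getcwd()\n"
matches, files = parse_grep_output(sample)
print(len(matches), sorted(files))
```

Empty or whitespace-only text yields no matches, which corresponds to grep exiting with code 1 when nothing is found.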
50 changes: 20 additions & 30 deletions openhands-sdk/openhands/sdk/mcp/definition.py
@@ -1,7 +1,6 @@
"""MCPTool definition and implementation."""

import json
from collections.abc import Sequence
from typing import Any

import mcp.types
@@ -51,28 +50,23 @@ def to_mcp_arguments(self) -> dict:
class MCPToolObservation(Observation):
"""Observation from MCP tool execution."""

content: list[TextContent | ImageContent] = Field(
default_factory=list,
description="Content returned from the MCP tool converted "
"to LLM Ready TextContent or ImageContent",
)
is_error: bool = Field(
default=False, description="Whether the call resulted in an error"
)
tool_name: str = Field(description="Name of the tool that was called")

@classmethod
def from_call_tool_result(
cls, tool_name: str, result: mcp.types.CallToolResult
) -> "MCPToolObservation":
"""Create an MCPToolObservation from a CallToolResult."""
content: list[mcp.types.ContentBlock] = result.content
convrted_content = []
for block in content:

native_content: list[mcp.types.ContentBlock] = result.content
content: list[TextContent | ImageContent] = [
TextContent(text=f"[Tool '{tool_name}' executed.]")
]
for block in native_content:
if isinstance(block, mcp.types.TextContent):
convrted_content.append(TextContent(text=block.text))
content.append(TextContent(text=block.text))
elif isinstance(block, mcp.types.ImageContent):
convrted_content.append(
content.append(
ImageContent(
image_urls=[f"data:{block.mimeType};base64,{block.data}"],
)
@@ -81,36 +75,32 @@ def from_call_tool_result(
logger.warning(
f"Unsupported MCP content block type: {type(block)}. Ignoring."
)

return cls(
content=convrted_content,
content=content,
is_error=result.isError,
tool_name=tool_name,
)
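The new `from_call_tool_result` puts the "[Tool ... executed.]" line at the front of `content` and then appends one entry per supported block. The sketch below mirrors that loop under stated assumptions: `McpText` and `McpImage` are hypothetical stand-ins for `mcp.types.TextContent` and `mcp.types.ImageContent`, and plain dicts stand in for the SDK's content classes.

```python
from dataclasses import dataclass


@dataclass
class McpText:  # hypothetical stand-in for mcp.types.TextContent
    text: str


@dataclass
class McpImage:  # hypothetical stand-in for mcp.types.ImageContent
    data: str
    mimeType: str


def convert_blocks(tool_name: str, blocks: list) -> list[dict]:
    """Header entry first, then one entry per supported block."""
    content: list[dict] = [
        {"type": "text", "text": f"[Tool '{tool_name}' executed.]"}
    ]
    for block in blocks:
        if isinstance(block, McpText):
            content.append({"type": "text", "text": block.text})
        elif isinstance(block, McpImage):
            # Images are passed along as base64 data URLs
            url = f"data:{block.mimeType};base64,{block.data}"
            content.append({"type": "image", "image_urls": [url]})
        # Any other block type is skipped here (the diff logs a warning instead)
    return content


converted = convert_blocks(
    "fetch", [McpText("hello"), McpImage("AAAA", "image/png"), object()]
)
```

Because the header now lives in `content` itself, the subclass no longer needs its own `to_llm_content` to prepend it.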

@property
def to_llm_content(self) -> Sequence[TextContent | ImageContent]:
"""Format the observation for agent display."""
initial_message = f"[Tool '{self.tool_name}' executed.]\n"
if self.is_error:
initial_message += "[An error occurred during execution.]\n"
return [TextContent(text=initial_message)] + self.content

@property
def visualize(self) -> Text:
"""Return Rich Text representation of this observation."""
content = Text()
content.append(f"[MCP Tool '{self.tool_name}' Observation]\n", style="bold")
text = Text()

if self.is_error:
content.append("[Error during execution]\n", style="bold red")
text.append("❌ ", style="red bold")
text.append(self.ERROR_MESSAGE_HEADER, style="bold red")

text.append(f"[MCP Tool '{self.tool_name}' Observation]\n", style="bold")
for block in self.content:
if isinstance(block, TextContent):
# try to see if block.text is a JSON
try:
parsed = json.loads(block.text)
content.append(display_dict(parsed))
text.append(display_dict(parsed))
continue
except (json.JSONDecodeError, TypeError):
content.append(block.text + "\n")
text.append(block.text + "\n")
elif isinstance(block, ImageContent):
content.append(f"[Image with {len(block.image_urls)} URLs]\n")
return content
text.append(f"[Image with {len(block.image_urls)} URLs]\n")
return text
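The `visualize` loop tries to parse each text block as JSON and falls back to the raw text when parsing fails. A minimal sketch of that fallback, assuming `json.dumps` as a stand-in for the SDK's `display_dict` helper (whose output format is not shown in this diff) and a hypothetical function name `render_block`:

```python
import json


def render_block(text: str) -> str:
    """Pretty-print text when it parses as JSON; otherwise return it unchanged."""
    try:
        parsed = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        # Not JSON: show the raw text, as the except branch in the diff does
        return text + "\n"
    # json.dumps stands in for display_dict here (an assumption)
    return json.dumps(parsed, indent=2, sort_keys=True) + "\n"


pretty = render_block('{"b": 1, "a": 2}')
raw = render_block("not json")
```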
9 changes: 4 additions & 5 deletions openhands-sdk/openhands/sdk/mcp/tool.py
@@ -12,7 +12,6 @@
from litellm import ChatCompletionToolParam
from pydantic import Field, ValidationError

from openhands.sdk.llm import TextContent
from openhands.sdk.logger import get_logger
from openhands.sdk.mcp.client import MCPClient
from openhands.sdk.mcp.definition import MCPToolAction, MCPToolObservation
@@ -69,8 +68,8 @@ async def call_tool(self, action: MCPToolAction) -> MCPToolObservation:
except Exception as e:
error_msg = f"Error calling MCP tool {self.tool_name}: {str(e)}"
logger.error(error_msg, exc_info=True)
return MCPToolObservation(
content=[TextContent(text=error_msg)],
return MCPToolObservation.from_text(
text=error_msg,
is_error=True,
tool_name=self.tool_name,
)
@@ -154,8 +153,8 @@ def __call__(
# Surface validation errors as an observation instead of crashing
error_msg = f"Validation error for MCP tool '{self.name}' args: {e}"
logger.error(error_msg, exc_info=True)
return MCPToolObservation(
content=[TextContent(text=error_msg)],
return MCPToolObservation.from_text(
text=error_msg,
is_error=True,
tool_name=self.name,
)
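Both hunks above replace a hand-built `content=[TextContent(...)]` with `from_text(..., is_error=True)`, so failures come back as observations rather than exceptions. A rough sketch of that pattern, using a hypothetical dataclass `ToolObservation` in place of `MCPToolObservation` and a hypothetical wrapper `safe_call`:

```python
from dataclasses import dataclass, field


@dataclass
class ToolObservation:  # hypothetical stand-in for MCPToolObservation
    tool_name: str
    content: list[str] = field(default_factory=list)
    is_error: bool = False

    @classmethod
    def from_text(cls, text: str, is_error: bool = False, **kwargs) -> "ToolObservation":
        # Wrap a plain string as the only content item
        return cls(content=[text], is_error=is_error, **kwargs)


def safe_call(tool_name: str, fn, *args) -> ToolObservation:
    """Run fn and surface any exception as an error observation instead of raising."""
    try:
        return ToolObservation.from_text(text=str(fn(*args)), tool_name=tool_name)
    except Exception as e:
        error_msg = f"Error calling MCP tool {tool_name}: {e}"
        return ToolObservation.from_text(
            text=error_msg, is_error=True, tool_name=tool_name
        )


ok = safe_call("divide", lambda a, b: a / b, 6, 3)
failed = safe_call("divide", lambda a, b: a / b, 1, 0)
```

Extra keyword arguments such as `tool_name` pass straight through `**kwargs` to the subclass constructor, which is why the call sites in the diff stay short.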
16 changes: 7 additions & 9 deletions openhands-sdk/openhands/sdk/tool/builtins/finish.py
@@ -4,7 +4,6 @@
from pydantic import Field
from rich.text import Text

from openhands.sdk.llm.message import ImageContent, TextContent
from openhands.sdk.tool.tool import (
Action,
Observation,
@@ -32,16 +31,15 @@ def visualize(self) -> Text:


class FinishObservation(Observation):
message: str = Field(description="Final message sent to the user.")

@property
def to_llm_content(self) -> Sequence[TextContent | ImageContent]:
return [TextContent(text=self.message)]
"""
Observation returned after finishing a task.
The FinishAction itself contains the message sent to the user so no
extra fields are needed here.
"""

@property
def visualize(self) -> Text:
Review comment (Collaborator):

I wonder if we still need these two methods? 🤔

There's at least one difference in behavior: to_llm_content() re-sent the user message before. Are we sure an empty message works well, or works the same?

"""Return Rich Text representation - empty since action shows the message."""
# Don't duplicate the finish message display - action already shows it
"""Return an empty Text representation since the message is in the action."""
return Text()


@@ -65,7 +63,7 @@ def __call__(
action: FinishAction,
conversation: "BaseConversation | None" = None, # noqa: ARG002
) -> FinishObservation:
return FinishObservation(message=action.message)
return FinishObservation.from_text(text=action.message)


class FinishTool(ToolDefinition[FinishAction, FinishObservation]):
20 changes: 7 additions & 13 deletions openhands-sdk/openhands/sdk/tool/builtins/think.py
@@ -4,7 +4,6 @@
from pydantic import Field
from rich.text import Text

from openhands.sdk.llm.message import ImageContent, TextContent
from openhands.sdk.tool.tool import (
Action,
Observation,
@@ -46,20 +45,15 @@ def visualize(self) -> Text:


class ThinkObservation(Observation):
"""Observation returned after logging a thought."""

content: str = Field(
default="Your thought has been logged.", description="Confirmation message."
)

@property
def to_llm_content(self) -> Sequence[TextContent | ImageContent]:
return [TextContent(text=self.content)]
"""
Observation returned after logging a thought.
The ThinkAction itself contains the thought logged so no extra
fields are needed here.
Review comment (Collaborator):

🤔
Before:

  • after a think action, we sent "Your thought has been logged." to the LLM.

After:

  • after a think action, do we send anything, or an empty message?

It's true that ThinkAction contains what the LLM sent us, but from the perspective of the LLM, it now gets no confirmation, nothing. Maybe that works, but idk, it is a change.

"""

@property
def visualize(self) -> Text:
"""Return Rich Text representation - empty since action shows the thought."""
# Don't duplicate the thought display - action already shows it
"""Return an empty Text representation since the thought is in the action."""
return Text()
Review comment (Collaborator):

I think we should still keep the no-op visualize function, since we don't want to display "Your thought has been logged".



@@ -81,7 +75,7 @@ def __call__(
_: ThinkAction,
conversation: "BaseConversation | None" = None, # noqa: ARG002
) -> ThinkObservation:
return ThinkObservation()
return ThinkObservation.from_text(text="Your thought has been logged.")


class ThinkTool(ToolDefinition[ThinkAction, ThinkObservation]):
86 changes: 75 additions & 11 deletions openhands-sdk/openhands/sdk/tool/schema.py
@@ -1,6 +1,6 @@
from abc import ABC, abstractmethod
from abc import ABC
from collections.abc import Sequence
from typing import Any, ClassVar, TypeVar
from typing import TYPE_CHECKING, Any, ClassVar, TypeVar

from pydantic import ConfigDict, Field, create_model
from rich.text import Text
@@ -13,6 +13,9 @@
from openhands.sdk.utils.visualize import display_dict


if TYPE_CHECKING:
from typing import Self

S = TypeVar("S", bound="Schema")


@@ -190,23 +193,84 @@ def visualize(self) -> Text:
class Observation(Schema, ABC):
"""Base schema for output observation."""

ERROR_MESSAGE_HEADER: ClassVar[str] = "[An error occurred during execution.]\n"

content: list[TextContent | ImageContent] = Field(
default_factory=list,
description=(
"Content returned from the tool as a list of "
"TextContent/ImageContent objects. "
"When there is an error, it should be written in this field."
),
)
is_error: bool = Field(
default=False, description="Whether the observation indicates an error"
)

@classmethod
def from_text(
cls,
text: str,
is_error: bool = False,
**kwargs: Any,
) -> "Self":
"""Utility to create an Observation from a simple text string.

Args:
text: The text content to include in the observation.
is_error: Whether this observation represents an error.
**kwargs: Additional fields for the observation subclass.

Returns:
An Observation instance with the text wrapped in a TextContent.
"""
return cls(content=[TextContent(text=text)], is_error=is_error, **kwargs)

@property
def text(self) -> str:
"""Extract all text content from the observation.

Returns:
Concatenated text from all TextContent items in content.
"""
return "".join(
item.text for item in self.content if isinstance(item, TextContent)
)

@property
@abstractmethod
def to_llm_content(self) -> Sequence[TextContent | ImageContent]:
"""Get the observation string to show to the agent."""
"""
Default content formatting for converting observation to LLM readable content.
Subclasses can override to provide richer content (e.g., images, diffs).
"""
llm_content: list[TextContent | ImageContent] = []

# If is_error is true, prepend error message
if self.is_error:
llm_content.append(TextContent(text=self.ERROR_MESSAGE_HEADER))

# Add content (now always a list)
llm_content.extend(self.content)

return llm_content

@property
def visualize(self) -> Text:
"""Return Rich Text representation of this action.
"""Return Rich Text representation of this observation.

This method can be overridden by subclasses to customize visualization.
The base implementation displays all action fields systematically.
Subclasses can override for custom visualization; by default we show the
same text that would be sent to the LLM.
"""
content = Text()
text = Text()

if self.is_error:
text.append("❌ ", style="red bold")
text.append(self.ERROR_MESSAGE_HEADER, style="bold red")

text_parts = content_to_str(self.to_llm_content)
if text_parts:
full_content = "".join(text_parts)
content.append(full_content)
text.append(full_content)
else:
content.append("[no text content]")
return content
text.append("[no text content]")
return text
22 changes: 13 additions & 9 deletions openhands-tools/openhands/tools/browser_use/definition.py
@@ -28,20 +28,24 @@
class BrowserObservation(Observation):
"""Base observation for browser operations."""

output: str = Field(description="The output message from the browser operation")
error: str | None = Field(default=None, description="Error message if any")
screenshot_data: str | None = Field(
default=None, description="Base64 screenshot data if available"
)

@property
def to_llm_content(self) -> Sequence[TextContent | ImageContent]:
if self.error:
return [TextContent(text=f"Error: {self.error}")]
llm_content: list[TextContent | ImageContent] = []

content: list[TextContent | ImageContent] = [
TextContent(text=maybe_truncate(self.output, MAX_BROWSER_OUTPUT_SIZE))
]
# If is_error is true, prepend error message
if self.is_error:
llm_content.append(TextContent(text=self.ERROR_MESSAGE_HEADER))

# Get text content and truncate if needed
content_text = self.text
if content_text:
llm_content.append(
TextContent(text=maybe_truncate(content_text, MAX_BROWSER_OUTPUT_SIZE))
)

if self.screenshot_data:
mime_type = "image/png"
@@ -55,9 +59,9 @@ def to_llm_content(self) -> Sequence[TextContent | ImageContent]:
mime_type = "image/webp"
# Convert base64 to data URL format for ImageContent
data_url = f"data:{mime_type};base64,{self.screenshot_data}"
content.append(ImageContent(image_urls=[data_url]))
llm_content.append(ImageContent(image_urls=[data_url]))

return content
return llm_content


# ============================================