sameelarif
Member

why

At the moment, Stagehand excels at browser automation but is limited in its ability to interact with third-party services. Many production use cases require these integrations. As the scope of browser agents expands, it’s becoming increasingly important that agents adapt to the web rather than the web adapting to them.

what changed

Added support for integrations (MCPs) and custom tools to be passed in to the Stagehand agent. Here are some examples:

const agent = stagehand.agent({
  integrations: [
    "https://www.example.com/mcp",
  ],
});

The above syntax will internally create the MCP connection. However, if you prefer a self-managed connection, there are two more options available. The first option is to use our built-in util to establish a connection:

import { connectToMCPServer } from "@browserbasehq/stagehand";

const exampleClient = await connectToMCPServer("https://www.example.com/mcp");

const agent = stagehand.agent({
  integrations: [exampleClient],
});

The alternative is to establish a connection to the MCP server with your own client. Stagehand takes in a Client object from the @modelcontextprotocol/sdk package. Please refer to their documentation for more information.

test plan

Internal agent evals.

changeset-bot bot commented Aug 21, 2025

⚠️ No Changeset found

Latest commit: 0382bae

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This pull request adds Model Context Protocol (MCP) and custom tool support to Stagehand agents, significantly expanding the framework's capabilities beyond browser automation to include third-party service integrations. The changes introduce a flexible architecture where agents can connect to external MCP servers and execute custom tools alongside existing browser automation functions.

The implementation provides three ways to configure integrations: (1) simple string URLs that are internally converted to MCP connections, (2) pre-established connections using the built-in connectToMCPServer utility, and (3) custom Client objects from the @modelcontextprotocol/sdk. The agent configuration now accepts integrations and tools parameters in the AgentConfig interface.
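The three options above reduce to two input shapes: a URL string that still needs a connection, or an already-connected client. A minimal sketch of that normalization step follows. `McpClientLike`, `Connect`, and `normalizeIntegrations` are hypothetical names used only for illustration; they are not taken from the PR, where the real client type comes from @modelcontextprotocol/sdk and the connection is made by connectToMCPServer.

```typescript
// Hypothetical stand-in for the Client type from @modelcontextprotocol/sdk.
interface McpClientLike {
  readonly serverUrl: string;
}

// An integration is either a server URL or an already-connected client.
type Integration = string | McpClientLike;

// Hypothetical connector, injected so the sketch stays self-contained.
// In Stagehand this role is played by connectToMCPServer.
type Connect = (url: string) => Promise<McpClientLike>;

async function normalizeIntegrations(
  integrations: Integration[],
  connect: Connect,
): Promise<McpClientLike[]> {
  const clients: McpClientLike[] = [];
  for (const integration of integrations) {
    if (typeof integration === "string") {
      // URL strings are connected on the caller's behalf.
      clients.push(await connect(integration));
    } else {
      // Self-managed clients are used as-is.
      clients.push(integration);
    }
  }
  return clients;
}
```

Injecting the connector keeps the sketch independent of any network access; the order of the returned clients matches the order of the `integrations` array.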

Core architectural changes include:

  • New MCP utilities: lib/mcp/connection.ts provides connectToMCPServer for establishing connections, while lib/mcp/utils.ts implements resolveTools to convert MCP tools into AI SDK's ToolSet format
  • Agent client updates: Both OpenAI and Anthropic CUA clients now accept and execute custom tools alongside their computer use functionality
  • Operator handler refactor: Complete rewrite from schema-based responses to a tool-calling architecture that combines built-in Stagehand operations (act, extract, goto, etc.) with external MCP tools
  • Type system enhancements: New error handling with MCPConnectionError, JSON Schema to Zod conversion utilities, and updated type definitions

The changes maintain backward compatibility while enabling agents to adapt to complex web environments requiring external API access, search capabilities, database interactions, and other third-party services. Examples demonstrate real-world usage with Exa AI search integration.

Confidence score: 4/5

  • This PR introduces complex new functionality but appears well-architected with proper error handling and flexible API design
  • Score reflects the significant architectural changes in critical files like the operator handler, though the implementation follows good patterns
  • Pay close attention to lib/handlers/operatorHandler.ts and the MCP connection utilities for potential integration issues

18 files reviewed, 3 comments

  }
  case "string": {
    if (schema.enum) {
      return z.string().refine((val) => schema.enum!.includes(val));
Contributor

style: This uses .refine() for enum validation instead of z.enum(). It works, but z.enum() would be more direct for string enums.

sameelarif and others added 3 commits August 22, 2025 09:20
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@@ -949,6 +940,27 @@ export class Stagehand {

return {
  execute: async (instructionOrOptions: string | AgentExecuteOptions) => {
    const tools = options?.integrations
      ? await resolveTools(options?.integrations, options?.tools)
      : (options?.tools ?? {});
Collaborator

we should probably gate tools behind experimental:true until we roll out api support
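One way such a gate could look, as a rough sketch only: check whether any integrations or custom tools were supplied and throw early unless the experimental flag is set. `AgentToolOptions`, `ExperimentalNotEnabledError`, and `assertToolsAllowed` are hypothetical names for illustration and are not part of this PR.

```typescript
// Hypothetical subset of the agent options discussed in this thread.
interface AgentToolOptions {
  integrations?: unknown[];
  tools?: Record<string, unknown>;
}

// Hypothetical error type; the PR may report this differently.
class ExperimentalNotEnabledError extends Error {
  constructor(feature: string) {
    super(`${feature} requires experimental: true`);
    this.name = "ExperimentalNotEnabledError";
  }
}

// Throws when integrations or custom tools are supplied without the flag.
function assertToolsAllowed(
  options: AgentToolOptions | undefined,
  experimental: boolean,
): void {
  const usesTools =
    (options?.integrations?.length ?? 0) > 0 ||
    Object.keys(options?.tools ?? {}).length > 0;
  if (usesTools && !experimental) {
    throw new ExperimentalNotEnabledError("Agent integrations and custom tools");
  }
}
```

Calling the check before any MCP connection is attempted would surface the error without opening connections that are then discarded.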

clientInstance = client;
}

let nextCursor: string | undefined = undefined;
Collaborator

what does cursor stand for? a pointer to iterate through the listed tools?

Member Author

yeah it's for handling pagination of the tools
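For readers unfamiliar with the pattern, cursor-based pagination can be sketched roughly like this: request a page, collect its tools, and repeat with the returned cursor until the server stops sending one. The interfaces below are hypothetical minimal shapes written for this example (the real types come from @modelcontextprotocol/sdk), and `listAllTools` is an illustrative helper, not the code in this PR.

```typescript
// Hypothetical minimal shapes, assumed for this sketch.
interface ToolInfo {
  name: string;
}

interface ListToolsPage {
  tools: ToolInfo[];
  nextCursor?: string; // absent on the last page
}

interface ToolLister {
  listTools(params?: { cursor?: string }): Promise<ListToolsPage>;
}

// Collects tools from every page by following nextCursor until it is absent.
async function listAllTools(client: ToolLister): Promise<ToolInfo[]> {
  const all: ToolInfo[] = [];
  let cursor: string | undefined;
  do {
    const page: ListToolsPage = await client.listTools(
      cursor ? { cursor } : undefined,
    );
    all.push(...page.tools);
    cursor = page.nextCursor;
  } while (cursor);
  return all;
}
```

The first request carries no cursor; each later request passes the opaque value the previous page returned, so the client never has to interpret it.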

@@ -42,7 +44,7 @@ export class StagehandAgentHandler {
options.modelName,
options.clientOptions || {},
options.userProvidedInstructions,
options.experimental,
Collaborator

do we want to keep experimental for context pruning? or now that it's deployed it's obsolete

}),
},
executeOptions,
},
Collaborator

table into separate pr once deployed

@miguelg719
Collaborator

e2e tests failing, need to double check

Comment on lines 324 to -331

// Add the assistant message with tool_use blocks to the history
if (this.experimental) {
  compressConversationImages(nextInputItems);
}
Collaborator

this removes image compression. don't remove, probably good to move out of experimental though

Comment on lines +26 to +37
const agent = stagehand.agent({
  provider: "openai",
  model: "computer-use-preview",
  integrations: [
    `https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`,
  ],
  instructions: `You have access to web search through Exa. Use it to find current information before browsing.`,
  options: {
    apiKey: process.env.OPENAI_API_KEY,
  },
});

Collaborator

when I run this I get Error executing tool web_search_exa: MCP error -32602: MCP error -32602: Invalid arguments for tool web_search_exa: [
