2 changes: 2 additions & 0 deletions pages/toolkits/databases/_meta.ts
@@ -1,3 +1,5 @@
export default {
postgres: "Postgres",
mongodb: "MongoDB",
clickhouse: "Clickhouse",
};
220 changes: 220 additions & 0 deletions pages/toolkits/databases/mongodb.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,220 @@
# MongoDB

import ToolInfo from "@/components/ToolInfo";
import Badges from "@/components/Badges";
import TabbedCodeBlock from "@/components/TabbedCodeBlock";
import TableOfContents from "@/components/TableOfContents";
import ToolFooter from "@/components/ToolFooter";

<ToolInfo
description="Enable agents to interact with MongoDB databases (read only)."
author="Arcade"
codeLink="https://github.com/ArcadeAI/arcade-ai/tree/main/toolkits/mongodb"
authType="database connection string"
versions={["0.1.0"]}
/>

<Badges repo="arcadeai/arcade-mongodb" />

The Arcade MongoDB toolkit provides a pre-built set of tools for read-only interaction with MongoDB databases. It enables agents to discover databases and collections, explore document structures, and execute queries safely. It is a companion to the blog post [Designing SQL Tools for AI Agents](https://blog.arcade.dev/sql-tools-ai-agents-security).

<Note>
This toolkit is an example of how to build a toolkit for a database and is not intended for production use, so you won't find it listed in the Arcade dashboard or APIs. For production use, we recommend forking this repository and building your own toolkit with use-case-specific tools.
</Note>

## Key Features

This toolkit demonstrates several important concepts for LLM-powered database interactions:

* **Database Discovery**: Automatically discover available databases in the MongoDB instance
* **Collection Exploration**: Find all collections within a specific database
* **Schema Inference**: Sample documents to infer schema structure and data types
* **Safe Query Execution**: Execute find queries with built-in safety measures
* **Aggregation Support**: Run complex aggregation pipelines for data analysis
* **Document Counting**: Count documents matching specific criteria
* **Connection Pooling**: Reuse database connections efficiently
* **Read-Only Access**: Enforce read-only access to prevent data modification
* **Result Limits**: Automatically limit query results to prevent overwhelming responses

## Available Tools

<TableOfContents
headers={["Tool Name", "Description"]}
data={
[
['MongoDB.DiscoverDatabases', "Discover all databases in the MongoDB instance."],
['MongoDB.DiscoverCollections', "Discover all collections in a specific database."],
['MongoDB.GetCollectionSchema', "Get the schema structure of a collection by sampling documents."],
['MongoDB.FindDocuments', "Find documents in a collection with filtering, projection, and sorting."],
['MongoDB.CountDocuments', "Count documents matching a specific filter."],
['MongoDB.AggregateDocuments', "Execute aggregation pipelines for complex data analysis."],
]
}
/>

Note that all tools require the `MONGODB_CONNECTION_STRING` secret to be set.

## MongoDB.DiscoverDatabases

Discover all databases in the MongoDB instance. This tool returns a list of all available databases, excluding system databases like `admin`, `config`, and `local` for security.
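
The filtering described above can be sketched in a few lines. This is a minimal illustration, not the toolkit's actual implementation: the `SYSTEM_DATABASES` constant and the `visible_databases` helper are hypothetical names, and the input list stands in for whatever the server reports.

```python
# Hypothetical sketch: hide MongoDB's built-in system databases from a listing.
SYSTEM_DATABASES = frozenset({"admin", "config", "local"})


def visible_databases(all_names):
    """Return the database names an agent should see, in sorted order."""
    return sorted(name for name in all_names if name not in SYSTEM_DATABASES)


print(visible_databases(["admin", "shop", "config", "analytics", "local"]))
# prints ['analytics', 'shop']
```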

<TabbedCodeBlock
tabs={[
{
label: "Call the Tool",
content: {
Python: [
"/examples/integrations/toolkits/mongodb/discover_databases_example_call_tool.py",
],
JavaScript: ["/examples/integrations/toolkits/mongodb/discover_databases_example_call_tool.js"],
},
}
]}
/>

## MongoDB.DiscoverCollections

Discover all collections in a specific database. This tool should be used before any other tool that requires a collection name.

**Parameters:**
- `database_name` (str): The database name to discover collections in

<TabbedCodeBlock
tabs={[
{
label: "Call the Tool",
content: {
Python: [
"/examples/integrations/toolkits/mongodb/discover_collections_example_call_tool.py",
],
JavaScript: ["/examples/integrations/toolkits/mongodb/discover_collections_example_call_tool.js"],
},
}
]}
/>

## MongoDB.GetCollectionSchema

Get the schema structure of a collection by sampling documents. Since MongoDB is schema-less, this tool samples a configurable number of documents to infer the schema structure and data types. Always use this tool before executing any query.

**Parameters:**
- `database_name` (str): The database name containing the collection
- `collection_name` (str): The name of the collection to inspect
- `sample_size` (int): The number of documents to sample for schema discovery (default: 100)
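
To make sampling-based schema inference concrete, here is a minimal sketch that merges the field names and value types seen across a handful of documents. It is an assumption about how such inference could work, not the toolkit's code: `infer_schema` is a hypothetical helper, it only looks at top-level fields, and it reports Python type names rather than BSON types.

```python
from collections import defaultdict


def infer_schema(documents):
    """Map each top-level field to the sorted list of Python type names seen."""
    seen = defaultdict(set)
    for doc in documents:
        for field, value in doc.items():
            seen[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in seen.items()}


# Two sampled documents with different shapes, as is common in MongoDB.
sample = [
    {"_id": 1, "name": "Ada", "age": 36},
    {"_id": 2, "name": "Lin", "age": None, "tags": ["admin"]},
]
print(infer_schema(sample))
```

Fields that appear in only some documents (such as `tags` here) still show up in the result, which is why a larger `sample_size` tends to give a more complete picture.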

<TabbedCodeBlock
tabs={[
{
label: "Call the Tool",
content: {
Python: [
"/examples/integrations/toolkits/mongodb/get_collection_schema_example_call_tool.py",
],
JavaScript: ["/examples/integrations/toolkits/mongodb/get_collection_schema_example_call_tool.js"],
},
}
]}
/>

## MongoDB.FindDocuments

Find documents in a collection with filtering, projection, and sorting. This tool allows you to build complex queries using MongoDB's query operators while maintaining safety and performance.

**Parameters:**
- `database_name` (str): The database name to query
- `collection_name` (str): The collection name to query
- `filter_dict` (str, optional): MongoDB filter/query as JSON string. Leave None for no filter
- `projection` (str, optional): Fields to include/exclude as JSON string. Use 1 to include, 0 to exclude
- `sort` (list[str], optional): Sort criteria as list of JSON strings with 'field' and 'direction' keys
- `limit` (int): Maximum number of documents to return (default: 100)
- `skip` (int): Number of documents to skip (default: 0)
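
Because `filter_dict`, `projection`, and `sort` are passed as JSON strings, it can help to build them from plain Python dicts. The sketch below uses a hypothetical `build_find_input` helper. The `direction` value of `1` for ascending order is an assumption borrowed from MongoDB's usual convention, so check the tool's schema before relying on it.

```python
import json


def build_find_input(database_name, collection_name, filter_dict=None,
                     projection=None, sort=None, limit=100, skip=0):
    """Serialize Python dicts into the JSON-string parameters listed above."""
    tool_input = {
        "database_name": database_name,
        "collection_name": collection_name,
        "limit": limit,
        "skip": skip,
    }
    if filter_dict is not None:
        tool_input["filter_dict"] = json.dumps(filter_dict)
    if projection is not None:
        tool_input["projection"] = json.dumps(projection)
    if sort:
        # One JSON string per sort key, in priority order.
        tool_input["sort"] = [json.dumps(item) for item in sort]
    return tool_input


tool_input = build_find_input(
    "my_database",
    "users",
    filter_dict={"status": "active", "age": {"$gte": 18}},
    projection={"name": 1, "email": 1, "_id": 0},
    sort=[{"field": "name", "direction": 1}],  # assumed: 1 means ascending
    limit=10,
)
print(tool_input)
```

The resulting dictionary has the same shape as the `tool_input` passed to `client.tools.execute` in the example files.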

**Best Practices:**
- Always use `discover_collections` and `get_collection_schema` before executing queries
Review comment (Collaborator): nit: these could be hyperlinks to the respective tools and use their SnakeCase names

- Always specify projection to limit fields returned if you don't need all data
- Always sort your results by the most important fields first
- Use appropriate MongoDB query operators for complex filtering ($gte, $lte, $in, $regex, etc.)
- Be mindful of case sensitivity when querying string fields
- Prefer filters on indexed fields when possible (`_id` is indexed by default, and commonly queried fields often are too)

<TabbedCodeBlock
tabs={[
{
label: "Call the Tool",
content: {
Python: [
"/examples/integrations/toolkits/mongodb/find_documents_example_call_tool.py",
],
JavaScript: ["/examples/integrations/toolkits/mongodb/find_documents_example_call_tool.js"],
},
}
]}
/>

## MongoDB.CountDocuments

Count documents in a collection matching the given filter. This tool is useful for getting quick counts without retrieving the actual documents.

**Parameters:**
- `database_name` (str): The database name to query
- `collection_name` (str): The collection name to query
- `filter_dict` (str, optional): MongoDB filter/query as JSON string. Leave None to count all documents
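
One possible use of a count is to plan pagination for `MongoDB.FindDocuments`, which accepts `limit` and `skip`. The sketch below is illustrative only: `page_skips` is a hypothetical helper, and the total of 250 stands in for a value returned by this tool.

```python
import math


def page_skips(total_count, page_size):
    """Return the `skip` value for each page needed to cover total_count documents."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    pages = math.ceil(total_count / page_size)
    return [page * page_size for page in range(pages)]


print(page_skips(250, 100))  # prints [0, 100, 200]
```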

<TabbedCodeBlock
tabs={[
{
label: "Call the Tool",
content: {
Python: [
"/examples/integrations/toolkits/mongodb/count_documents_example_call_tool.py",
],
JavaScript: ["/examples/integrations/toolkits/mongodb/count_documents_example_call_tool.js"],
},
}
]}
/>

## MongoDB.AggregateDocuments

Execute aggregation pipelines for complex data analysis. This tool allows you to run sophisticated data processing operations including grouping, filtering, and transformations.

**Parameters:**
- `database_name` (str): The database name to query
- `collection_name` (str): The collection name to query
- `pipeline` (list[str]): MongoDB aggregation pipeline as a list of JSON strings
- `limit` (int): Maximum number of results to return (default: 100)

**Common Aggregation Stages:**
- `$match` - filter documents
- `$group` - group documents and perform calculations
- `$project` - reshape documents
- `$sort` - sort documents
- `$limit` - limit results
- `$lookup` - join with other collections
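
MongoDB also has aggregation stages that write data (`$out` and `$merge`), so a read-only tool needs to reject them. The following is a minimal sketch of such a guard under stated assumptions: `prepare_pipeline` and `WRITE_STAGES` are hypothetical names, and this is not necessarily how the toolkit enforces read-only access or applies its `limit`.

```python
import json

WRITE_STAGES = {"$out", "$merge"}  # aggregation stages that write to a collection


def prepare_pipeline(stage_strings, limit=100):
    """Parse JSON-string stages, reject write stages, and cap the result size."""
    pipeline = []
    for raw in stage_strings:
        stage = json.loads(raw)
        if not isinstance(stage, dict) or len(stage) != 1:
            raise ValueError(f"each stage must be an object with one key: {raw}")
        (name,) = stage  # the single key is the stage operator, e.g. "$match"
        if name in WRITE_STAGES:
            raise ValueError(f"stage {name} is not allowed in read-only mode")
        pipeline.append(stage)
    # Append a final $limit so the result set stays bounded.
    pipeline.append({"$limit": limit})
    return pipeline


stages = [
    json.dumps({"$match": {"status": "active"}}),
    json.dumps({"$group": {"_id": "$category", "count": {"$sum": 1}}}),
]
print(prepare_pipeline(stages, limit=20))
```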

<TabbedCodeBlock
tabs={[
{
label: "Call the Tool",
content: {
Python: [
"/examples/integrations/toolkits/mongodb/aggregate_documents_example_call_tool.py",
],
JavaScript: ["/examples/integrations/toolkits/mongodb/aggregate_documents_example_call_tool.js"],
},
}
]}
/>

## Usage Workflow

For optimal results, follow this workflow when using the MongoDB toolkit:
Review comment (Collaborator): I like this a lot, I will make a note to add this section to other toolkits in the future!
Review comment (Collaborator): nit: These could be links to the tools and use SnakeCase maybe


1. **Discover Databases**: Use `discover_databases` to see available databases
2. **Discover Collections**: Use `discover_collections` with your target database
3. **Get Collection Schema**: Use `get_collection_schema` for each collection you plan to query
4. **Execute Queries**: Use `find_documents`, `count_documents`, or `aggregate_documents` with the schema information

This workflow ensures your agent has complete information about the database structure before attempting queries, reducing errors and improving query performance.
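
The discovery steps of this workflow can be sketched as a single function. The `explore` function and its `execute(tool_name, tool_input)` callable are hypothetical, and the sketch assumes that the two discovery tools return lists of names, which this page does not confirm. A fake `execute` is used so the example runs without a database.

```python
def explore(execute, sample_size=100):
    """Run the discovery tools in the recommended order and collect schemas.

    `execute(tool_name, tool_input)` is a hypothetical callable that runs one
    tool and returns its output value.
    """
    overview = {}
    for database_name in execute("MongoDB.DiscoverDatabases", {}):
        collections = execute(
            "MongoDB.DiscoverCollections", {"database_name": database_name}
        )
        overview[database_name] = {
            collection_name: execute(
                "MongoDB.GetCollectionSchema",
                {
                    "database_name": database_name,
                    "collection_name": collection_name,
                    "sample_size": sample_size,
                },
            )
            for collection_name in collections
        }
    return overview


def fake_execute(tool_name, tool_input):
    """Stand-in for a real client call, returning canned data."""
    if tool_name == "MongoDB.DiscoverDatabases":
        return ["shop"]
    if tool_name == "MongoDB.DiscoverCollections":
        return ["users", "orders"]
    return {"sampled": tool_input["collection_name"]}


print(explore(fake_execute))
```

With `arcadepy`, `execute` could wrap `client.tools.execute(tool_name=..., input=..., user_id=...)` and return `response.output.value`, as in the example files in this pull request.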

<ToolFooter pipPackageName="arcade_mongodb"/>
@@ -0,0 +1,26 @@
import { Arcade } from "arcadejs";

const client = new Arcade(); // Automatically finds the `ARCADE_API_KEY` env variable

const TOOL_NAME = "MongoDB.AggregateDocuments";

const userId = "{arcade_user_id}";

const toolInput = {
database_name: "my_database",
collection_name: "users",
pipeline: [
JSON.stringify({"$match": {"status": "active"}}),
JSON.stringify({"$group": {"_id": "$category", "count": {"$sum": 1}}}),
JSON.stringify({"$sort": {"count": -1}})
],
limit: 20
};

const response = await client.tools.execute({
toolName: TOOL_NAME,
input: toolInput,
userId: userId,
});

console.log(response.output.value);
@@ -0,0 +1,27 @@
import json
from arcadepy import Arcade

client = Arcade() # Automatically finds the `ARCADE_API_KEY` env variable

TOOL_NAME = "MongoDB.AggregateDocuments"

user_id = "{arcade_user_id}"

tool_input = {
"database_name": "my_database",
"collection_name": "users",
"pipeline": [
json.dumps({"$match": {"status": "active"}}),
json.dumps({"$group": {"_id": "$category", "count": {"$sum": 1}}}),
json.dumps({"$sort": {"count": -1}})
],
"limit": 20
}

response = client.tools.execute(
tool_name=TOOL_NAME,
input=tool_input,
user_id=user_id,
)

print(response.output.value)
@@ -0,0 +1,21 @@
import { Arcade } from "arcadejs";

const client = new Arcade(); // Automatically finds the `ARCADE_API_KEY` env variable

const TOOL_NAME = "MongoDB.CountDocuments";

const userId = "{arcade_user_id}";

const toolInput = {
database_name: "my_database",
collection_name: "users",
filter_dict: JSON.stringify({"status": "active"})
};

const response = await client.tools.execute({
toolName: TOOL_NAME,
input: toolInput,
userId: userId,
});

console.log(response.output.value);
@@ -0,0 +1,22 @@
import json
from arcadepy import Arcade

client = Arcade() # Automatically finds the `ARCADE_API_KEY` env variable

TOOL_NAME = "MongoDB.CountDocuments"

user_id = "{arcade_user_id}"

tool_input = {
"database_name": "my_database",
"collection_name": "users",
"filter_dict": json.dumps({"status": "active"})
}

response = client.tools.execute(
tool_name=TOOL_NAME,
input=tool_input,
user_id=user_id,
)

print(response.output.value)
@@ -0,0 +1,19 @@
import { Arcade } from "arcadejs";

const client = new Arcade(); // Automatically finds the `ARCADE_API_KEY` env variable

const TOOL_NAME = "MongoDB.DiscoverCollections";

const userId = "{arcade_user_id}";

const toolInput = {
database_name: "my_database"
};

const response = await client.tools.execute({
toolName: TOOL_NAME,
input: toolInput,
userId: userId,
});

console.log(response.output.value);
@@ -0,0 +1,19 @@
from arcadepy import Arcade

client = Arcade() # Automatically finds the `ARCADE_API_KEY` env variable

TOOL_NAME = "MongoDB.DiscoverCollections"

user_id = "{arcade_user_id}"

tool_input = {
"database_name": "my_database"
}

response = client.tools.execute(
tool_name=TOOL_NAME,
input=tool_input,
user_id=user_id,
)

print(response.output.value)
@@ -0,0 +1,17 @@
import { Arcade } from "arcadejs";

const client = new Arcade(); // Automatically finds the `ARCADE_API_KEY` env variable

const TOOL_NAME = "MongoDB.DiscoverDatabases";

const userId = "{arcade_user_id}";

const toolInput = {};

const response = await client.tools.execute({
toolName: TOOL_NAME,
input: toolInput,
userId: userId,
});

console.log(response.output.value);