From 391f88114d282f43cbe420a955f2ec4915358973 Mon Sep 17 00:00:00 2001 From: cirilla-zmh Date: Mon, 22 Sep 2025 21:38:29 +0800 Subject: [PATCH] Add json schema of gen ai tool definitions --- docs/gen-ai/aws-bedrock.md | 21 +++++-- docs/gen-ai/azure-ai-inference.md | 21 +++++-- docs/gen-ai/gen-ai-agent-spans.md | 21 +++++-- docs/gen-ai/gen-ai-events.md | 21 +++++-- docs/gen-ai/gen-ai-spans.md | 21 +++++-- docs/gen-ai/gen-ai-tool-definitions.json | 59 +++++++++++++++++++ .../non-normative/examples-llm-calls.md | 25 ++++++++ docs/gen-ai/non-normative/models.ipynb | 49 ++++++++++++++- docs/gen-ai/openai.md | 21 +++++-- docs/registry/attributes/gen-ai.md | 21 +++++-- model/gen-ai/registry.yaml | 21 +++++-- 11 files changed, 251 insertions(+), 50 deletions(-) create mode 100644 docs/gen-ai/gen-ai-tool-definitions.json diff --git a/docs/gen-ai/aws-bedrock.md b/docs/gen-ai/aws-bedrock.md index 6bbe413fa0..6f2bd5ad0b 100644 --- a/docs/gen-ai/aws-bedrock.md +++ b/docs/gen-ai/aws-bedrock.md @@ -194,13 +194,22 @@ section for more details. **[14] `gen_ai.tool.definitions`:** The value of this attribute matches source system tool definition format. -It's expected to be an array of objects where each object represents a tool definition. In case a serialized string is available -to the instrumentation, the instrumentation SHOULD do the best effort to -deserialize it to an array. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise. +It's expected to be an array of objects, each representing a tool definition, +and the structure of the array is expected to match the [Tool Definitions JSON Schema](/docs/gen-ai/gen-ai-tool-definitions.json). +In case a serialized string is available to the instrumentation, the instrumentation +SHOULD do the best effort to deserialize it to an array. -Since this attribute could be large, it's NOT RECOMMENDED to populate -it by default. 
Instrumentations MAY provide a way to enable -populating this attribute. +When the attribute is recorded on events, it MUST be recorded in structured +form. When recorded on spans, it MAY be recorded as a JSON string if structured +format is not supported and SHOULD be recorded in structured form otherwise. + +If instrumentations can reliably deserialize and extract the tool definitions, +it's RECOMMENDED to only populate required fields of the definition objects +by default. Otherwise, it's NOT RECOMMENDED to populate it by default. +Instrumentations MAY provide a way to enable populating this attribute. + +> [!Warning] +> This attribute is likely to contain sensitive information including user/PII data. --- diff --git a/docs/gen-ai/azure-ai-inference.md b/docs/gen-ai/azure-ai-inference.md index ea3d446b62..3ce8bed1b1 100644 --- a/docs/gen-ai/azure-ai-inference.md +++ b/docs/gen-ai/azure-ai-inference.md @@ -195,13 +195,22 @@ section for more details. **[14] `gen_ai.tool.definitions`:** The value of this attribute matches source system tool definition format. -It's expected to be an array of objects where each object represents a tool definition. In case a serialized string is available -to the instrumentation, the instrumentation SHOULD do the best effort to -deserialize it to an array. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise. +It's expected to be an array of objects, each representing a tool definition, +and the structure of the array is expected to match the [Tool Definitions JSON Schema](/docs/gen-ai/gen-ai-tool-definitions.json). +In case a serialized string is available to the instrumentation, the instrumentation +SHOULD do the best effort to deserialize it to an array. -Since this attribute could be large, it's NOT RECOMMENDED to populate -it by default. Instrumentations MAY provide a way to enable -populating this attribute. 
+When the attribute is recorded on events, it MUST be recorded in structured +form. When recorded on spans, it MAY be recorded as a JSON string if structured +format is not supported and SHOULD be recorded in structured form otherwise. + +If instrumentations can reliably deserialize and extract the tool definitions, +it's RECOMMENDED to only populate required fields of the definition objects +by default. Otherwise, it's NOT RECOMMENDED to populate it by default. +Instrumentations MAY provide a way to enable populating this attribute. + +> [!Warning] +> This attribute is likely to contain sensitive information including user/PII data. --- diff --git a/docs/gen-ai/gen-ai-agent-spans.md b/docs/gen-ai/gen-ai-agent-spans.md index 2eaeaa51a3..9baa37a125 100644 --- a/docs/gen-ai/gen-ai-agent-spans.md +++ b/docs/gen-ai/gen-ai-agent-spans.md @@ -334,13 +334,22 @@ section for more details. **[15] `gen_ai.tool.definitions`:** The value of this attribute matches source system tool definition format. -It's expected to be an array of objects where each object represents a tool definition. In case a serialized string is available -to the instrumentation, the instrumentation SHOULD do the best effort to -deserialize it to an array. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise. +It's expected to be an array of objects, each representing a tool definition, +and the structure of the array is expected to match the [Tool Definitions JSON Schema](/docs/gen-ai/gen-ai-tool-definitions.json). +In case a serialized string is available to the instrumentation, the instrumentation +SHOULD do the best effort to deserialize it to an array. -Since this attribute could be large, it's NOT RECOMMENDED to populate -it by default. Instrumentations MAY provide a way to enable -populating this attribute. +When the attribute is recorded on events, it MUST be recorded in structured +form. 
When recorded on spans, it MAY be recorded as a JSON string if structured +format is not supported and SHOULD be recorded in structured form otherwise. + +If instrumentations can reliably deserialize and extract the tool definitions, +it's RECOMMENDED to only populate required fields of the definition objects +by default. Otherwise, it's NOT RECOMMENDED to populate it by default. +Instrumentations MAY provide a way to enable populating this attribute. + +> [!Warning] +> This attribute is likely to contain sensitive information including user/PII data. --- diff --git a/docs/gen-ai/gen-ai-events.md b/docs/gen-ai/gen-ai-events.md index 0ccae8ab5b..9dd1201204 100644 --- a/docs/gen-ai/gen-ai-events.md +++ b/docs/gen-ai/gen-ai-events.md @@ -175,13 +175,22 @@ section for more details. **[13] `gen_ai.tool.definitions`:** The value of this attribute matches source system tool definition format. -It's expected to be an array of objects where each object represents a tool definition. In case a serialized string is available -to the instrumentation, the instrumentation SHOULD do the best effort to -deserialize it to an array. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise. +It's expected to be an array of objects, each representing a tool definition, +and the structure of the array is expected to match the [Tool Definitions JSON Schema](/docs/gen-ai/gen-ai-tool-definitions.json). +In case a serialized string is available to the instrumentation, the instrumentation +SHOULD do the best effort to deserialize it to an array. -Since this attribute could be large, it's NOT RECOMMENDED to populate -it by default. Instrumentations MAY provide a way to enable -populating this attribute. +When the attribute is recorded on events, it MUST be recorded in structured +form. 
When recorded on spans, it MAY be recorded as a JSON string if structured +format is not supported and SHOULD be recorded in structured form otherwise. + +If instrumentations can reliably deserialize and extract the tool definitions, +it's RECOMMENDED to only populate required fields of the definition objects +by default. Otherwise, it's NOT RECOMMENDED to populate it by default. +Instrumentations MAY provide a way to enable populating this attribute. + +> [!Warning] +> This attribute is likely to contain sensitive information including user/PII data. --- diff --git a/docs/gen-ai/gen-ai-spans.md b/docs/gen-ai/gen-ai-spans.md index 2c33f05809..c1f00f6760 100644 --- a/docs/gen-ai/gen-ai-spans.md +++ b/docs/gen-ai/gen-ai-spans.md @@ -207,13 +207,22 @@ section for more details. **[14] `gen_ai.tool.definitions`:** The value of this attribute matches source system tool definition format. -It's expected to be an array of objects where each object represents a tool definition. In case a serialized string is available -to the instrumentation, the instrumentation SHOULD do the best effort to -deserialize it to an array. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise. +It's expected to be an array of objects, each representing a tool definition, +and the structure of the array is expected to match the [Tool Definitions JSON Schema](/docs/gen-ai/gen-ai-tool-definitions.json). +In case a serialized string is available to the instrumentation, the instrumentation +SHOULD do the best effort to deserialize it to an array. + +When the attribute is recorded on events, it MUST be recorded in structured +form. When recorded on spans, it MAY be recorded as a JSON string if structured +format is not supported and SHOULD be recorded in structured form otherwise. 
+ +If instrumentations can reliably deserialize and extract the tool definitions, +it's RECOMMENDED to only populate required fields of the definition objects +by default. Otherwise, it's NOT RECOMMENDED to populate it by default. +Instrumentations MAY provide a way to enable populating this attribute. -Since this attribute could be large, it's NOT RECOMMENDED to populate -it by default. Instrumentations MAY provide a way to enable -populating this attribute. +> [!Warning] +> This attribute is likely to contain sensitive information including user/PII data. --- diff --git a/docs/gen-ai/gen-ai-tool-definitions.json b/docs/gen-ai/gen-ai-tool-definitions.json new file mode 100644 index 0000000000..deec08c382 --- /dev/null +++ b/docs/gen-ai/gen-ai-tool-definitions.json @@ -0,0 +1,59 @@ +{ + "$defs": { + "ToolDefinition": { + "additionalProperties": true, + "properties": { + "type": { + "anyOf": [ + { + "$ref": "#/$defs/ToolType" + }, + { + "type": "string" + } + ], + "description": "Type of the tool.", + "title": "ToolType" + }, + "name": { + "description": "Name of the tool.", + "title": "Name", + "type": "string" + }, + "description": { + "description": "Description of the tool.", + "title": "Description", + "type": "string" + }, + "parameters": { + "description": "Format of the tool parameters. Maybe it is a JSON schema.", + "title": "Parameters" + }, + "response": { + "description": "Format of the tool response. 
Maybe it is a JSON schema.", + "title": "Response" + } + }, + "required": [ + "type", + "name" + ], + "title": "ToolDefinition", + "type": "object" + }, + "ToolType": { + "enum": [ + "function", + "custom" + ], + "title": "ToolType", + "type": "string" + } + }, + "description": "Represents the list of tool definitions available to the GenAI agent or model.", + "items": { + "$ref": "#/$defs/ToolDefinition" + }, + "title": "ToolDefinitions", + "type": "array" +} \ No newline at end of file diff --git a/docs/gen-ai/non-normative/examples-llm-calls.md b/docs/gen-ai/non-normative/examples-llm-calls.md index 9965ffb6f2..3c5b7d256b 100644 --- a/docs/gen-ai/non-normative/examples-llm-calls.md +++ b/docs/gen-ai/non-normative/examples-llm-calls.md @@ -298,6 +298,7 @@ They are likely to be siblings if there is an encompassing span. | `gen_ai.response.finish_reasons`| `["tool_calls"]` | | `gen_ai.input.messages` | [`gen_ai.input.messages`](#gen-ai-input-messages-tool-call-span-1) | | `gen_ai.output.messages` | [`gen_ai.output.messages`](#gen-ai-output-messages-tool-call-span-1) | +| `gen_ai.tool.definitions` | [`gen_ai.tool.definitions`](#gen-ai-tool-definitions-tool-call-span-1) | `gen_ai.input.messages` value @@ -336,6 +337,30 @@ They are likely to be siblings if there is an encompassing span. ] ``` +`gen_ai.tool.definitions` value + +```json +[ + { + "type": "function", + "name": "get_weather", + "description": "Get the weather in a given location", + "parameters": { + "type": "object", + "properties": { + "location": { + "type": "string", + "description": "The city and state, e.g.
San Francisco, CA" + } + }, + "required": [ + "location" + ] + } + } +] +``` + **Tool call:** If tool call is [instrumented according to execute-tool span definition](../gen-ai-spans.md#execute-tool-span), it may look like this: diff --git a/docs/gen-ai/non-normative/models.ipynb b/docs/gen-ai/non-normative/models.ipynb index 891bed9423..c40c81934f 100644 --- a/docs/gen-ai/non-normative/models.ipynb +++ b/docs/gen-ai/non-normative/models.ipynb @@ -41,7 +41,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "id": "5124fe15", "metadata": {}, "outputs": [], @@ -117,7 +117,24 @@ " description=\"List of message parts that make up the message content.\")\n", "\n", " class Config:\n", - " extra = \"allow\"" + " extra = \"allow\"\n", + "\n", + "class ToolType(str, Enum):\n", + " FUNCTION = \"function\"\n", + " CUSTOM = \"custom\"\n", + "\n", + "class ToolDefinition(BaseModel):\n", + " \"\"\"\n", + " Represents a tool definition.\n", + " \"\"\"\n", + " type: Union[ToolType, str] = Field(description=\"Type of the tool.\")\n", + " name: str = Field(description=\"Name of the tool.\")\n", + " description: str = Field(description=\"Description of the tool.\")\n", + " parameters: Any = Field(description=\"Format of the tool parameters. Maybe it is a JSON schema.\")\n", + " response: Any = Field(description=\"Format of the tool response.
Maybe it is a JSON schema.\")\n", + "\n", + " class Config:\n", + " extra = \"allow\"\n" ] }, { @@ -222,6 +239,34 @@ "# Print the JSON schema for the SystemInstructions model\n", "print(json.dumps(SystemInstructions.model_json_schema(), indent=4))" ] + }, + { + "cell_type": "markdown", + "id": "f019c33a", + "metadata": {}, + "source": [ + "## `gen_ai.tool.definitions` model\n", + "\n", + "Corresponding attribute: [`gen_ai.tool.definitions`](/docs/registry/attributes/gen-ai.md#gen-ai-tool-definitions).\n", + "JSON schema: [`gen-ai-tool-definitions.json`](../gen-ai-tool-definitions.json)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a9e84726", + "metadata": {}, + "outputs": [], + "source": [ + "class ToolDefinitions(RootModel[List[ToolDefinition]]):\n", + " \"\"\"\n", + " Represents the list of tool definitions available to the GenAI agent or model.\n", + " \"\"\"\n", + " pass\n", + "\n", + "# Print the JSON schema for the ToolDefinitions model\n", + "print(json.dumps(ToolDefinitions.model_json_schema(), indent=4))" + ] } ], "metadata": { diff --git a/docs/gen-ai/openai.md b/docs/gen-ai/openai.md index 4af34bd316..49b3f13521 100644 --- a/docs/gen-ai/openai.md +++ b/docs/gen-ai/openai.md @@ -200,13 +200,22 @@ section for more details. **[15] `gen_ai.tool.definitions`:** The value of this attribute matches source system tool definition format. -It's expected to be an array of objects where each object represents a tool definition. In case a serialized string is available -to the instrumentation, the instrumentation SHOULD do the best effort to -deserialize it to an array. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise. +It's expected to be an array of objects, each representing a tool definition, +and the structure of the array is expected to match the [Tool Definitions JSON Schema](/docs/gen-ai/gen-ai-tool-definitions.json).
+In case a serialized string is available to the instrumentation, the instrumentation +SHOULD do the best effort to deserialize it to an array. -Since this attribute could be large, it's NOT RECOMMENDED to populate -it by default. Instrumentations MAY provide a way to enable -populating this attribute. +When the attribute is recorded on events, it MUST be recorded in structured +form. When recorded on spans, it MAY be recorded as a JSON string if structured +format is not supported and SHOULD be recorded in structured form otherwise. + +If instrumentations can reliably deserialize and extract the tool definitions, +it's RECOMMENDED to only populate required fields of the definition objects +by default. Otherwise, it's NOT RECOMMENDED to populate it by default. +Instrumentations MAY provide a way to enable populating this attribute. + +> [!Warning] +> This attribute is likely to contain sensitive information including user/PII data. --- diff --git a/docs/registry/attributes/gen-ai.md b/docs/registry/attributes/gen-ai.md index abc61bcf9b..61d2217f7f 100644 --- a/docs/registry/attributes/gen-ai.md +++ b/docs/registry/attributes/gen-ai.md @@ -157,13 +157,22 @@ deserialize it to an object. When recorded on spans, it MAY be recorded as a JSO **[12] `gen_ai.tool.definitions`:** The value of this attribute matches source system tool definition format. -It's expected to be an array of objects where each object represents a tool definition. In case a serialized string is available -to the instrumentation, the instrumentation SHOULD do the best effort to -deserialize it to an array. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise. +It's expected to be an array of objects, each representing a tool definition, +and the structure of the array is expected to match the [Tool Definitions JSON Schema](/docs/gen-ai/gen-ai-tool-definitions.json). 
+In case a serialized string is available to the instrumentation, the instrumentation +SHOULD do the best effort to deserialize it to an array. + +When the attribute is recorded on events, it MUST be recorded in structured +form. When recorded on spans, it MAY be recorded as a JSON string if structured +format is not supported and SHOULD be recorded in structured form otherwise. + +If instrumentations can reliably deserialize and extract the tool definitions, +it's RECOMMENDED to only populate required fields of the definition objects +by default. Otherwise, it's NOT RECOMMENDED to populate it by default. +Instrumentations MAY provide a way to enable populating this attribute. -Since this attribute could be large, it's NOT RECOMMENDED to populate -it by default. Instrumentations MAY provide a way to enable -populating this attribute. +> [!Warning] +> This attribute is likely to contain sensitive information including user/PII data. **[13] `gen_ai.tool.type`:** Extension: A tool executed on the agent-side to directly call external APIs, bridging the gap between the agent and real-world systems. Agent-side operations involve actions that are performed by the agent on the server or within the agent's controlled environment. diff --git a/model/gen-ai/registry.yaml b/model/gen-ai/registry.yaml index 8b3fa51dd4..c5e658f9cd 100644 --- a/model/gen-ai/registry.yaml +++ b/model/gen-ai/registry.yaml @@ -303,13 +303,22 @@ groups: note: | The value of this attribute matches source system tool definition format. - It's expected to be an array of objects where each object represents a tool definition. In case a serialized string is available - to the instrumentation, the instrumentation SHOULD do the best effort to - deserialize it to an array. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise. 
+ It's expected to be an array of objects, each representing a tool definition, + and the structure of the array is expected to match the [Tool Definitions JSON Schema](/docs/gen-ai/gen-ai-tool-definitions.json). + In case a serialized string is available to the instrumentation, the instrumentation + SHOULD do the best effort to deserialize it to an array. + + When the attribute is recorded on events, it MUST be recorded in structured + form. When recorded on spans, it MAY be recorded as a JSON string if structured + format is not supported and SHOULD be recorded in structured form otherwise. + + If instrumentations can reliably deserialize and extract the tool definitions, + it's RECOMMENDED to only populate required fields of the definition objects + by default. Otherwise, it's NOT RECOMMENDED to populate it by default. + Instrumentations MAY provide a way to enable populating this attribute. - Since this attribute could be large, it's NOT RECOMMENDED to populate - it by default. Instrumentations MAY provide a way to enable - populating this attribute. + > [!Warning] + > This attribute is likely to contain sensitive information including user/PII data. examples: - | [