Add support for aspect ratio in gemini image generation #3412
base: main
e7b6dec to 9a3d8b0
docs/builtin-tools.md
Outdated
_(This example is complete, it can be run "as is")_

To control the aspect ratio when using Gemini image models, include the `ImageGenerationTool` explicitly:
Move this example under "Configuration Options" please
docs/builtin-tools.md
Outdated
|----------|-----------|-------|
| OpenAI Responses | ✅ | Full feature support. Only supported by models newer than `gpt-5`. Metadata about the generated image, like the [`revised_prompt`](https://platform.openai.com/docs/guides/tools-image-generation#revised-prompt) sent to the underlying image model, is available on the [`BuiltinToolReturnPart`][pydantic_ai.messages.BuiltinToolReturnPart] that's available via [`ModelResponse.builtin_tool_calls`][pydantic_ai.messages.ModelResponse.builtin_tool_calls]. |
| Google | ✅ | No parameter support. Only supported by [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) like `gemini-2.5-flash-image`. These models do not support [structured output](output.md) or [function tools](tools.md). These models will always generate images, even if this built-in tool is not explicitly specified. |
| Google | ✅ | Supports the `aspect_ratio` parameter when explicitly provided. Only supported by [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) like `gemini-2.5-flash-image`. These models do not support [structured output](output.md) or [function tools](tools.md) and will always generate images, even if this built-in tool is not explicitly specified. |
I'd say "Limited parameter support" like we do further up for web search.
_BUILTIN_TOOL_TYPES: dict[str, type[AbstractBuiltinTool]] = {}

ImageAspectRatio = Literal['21:9', '16:9', '4:3', '3:2', '1:1', '9:16', '3:4', '2:3', '5:4', '4:5']
I'm guessing the google-genai SDK does not have a type for this?
yea, right now it is just:
```python
class ImageConfigDict(TypedDict, total=False):
    """The image generation configuration to be used in GenerateContentConfig."""

    aspect_ratio: Optional[str]
    """Aspect ratio of the generated images. Supported values are
    "1:1", "2:3", "3:2", "3:4", "4:3", "9:16", "16:9", and "21:9"."""
```
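Since the SDK types the field as a plain `Optional[str]`, one way to see what the PR's `ImageAspectRatio` literal adds is to check values against it at runtime. The following is a minimal sketch using only the standard library; `validate_aspect_ratio` and `SDK_DOCUMENTED` are hypothetical names for illustration and are not part of the PR or of google-genai.

```python
from typing import Literal, get_args

# Copied from the diff under review.
ImageAspectRatio = Literal['21:9', '16:9', '4:3', '3:2', '1:1', '9:16', '3:4', '2:3', '5:4', '4:5']

# Values listed in the SDK docstring quoted in the reply above.
SDK_DOCUMENTED = {'1:1', '2:3', '3:2', '3:4', '4:3', '9:16', '16:9', '21:9'}


def validate_aspect_ratio(value: str) -> str:
    """Hypothetical helper: reject values outside the Literal at runtime."""
    allowed = get_args(ImageAspectRatio)
    if value not in allowed:
        raise ValueError(f'Unsupported aspect ratio {value!r}; expected one of {allowed}')
    return value


# Values in the PR's Literal that the quoted docstring does not mention.
extra = sorted(set(get_args(ImageAspectRatio)) - SDK_DOCUMENTED)
```

Here `extra` contains `'4:5'` and `'5:4'`, which the quoted docstring does not list. Whether the API accepts them is not something this thread settles.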
Supported by:
* Google image-generation models (Gemini) when the tool is explicitly enabled.
I think we can drop "when the tool is explicitly enabled." as that's implied by this being on that builtin tool class.
I wonder if we could support some of these values for OpenAI as well by mapping to one of the size options. Then we'd only need to raise an error from OpenAI if another value is used, or if size and aspect_ratio are used at the same time.
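The mapping idea in the comment above could look roughly like the sketch below. It assumes the OpenAI size options are `1024x1024`, `1536x1024` and `1024x1536`, and maps only ratios that match one of them exactly. `resolve_openai_size`, `_ASPECT_RATIO_TO_SIZE` and the local `UserError` class are hypothetical stand-ins, not code from the PR.

```python
from typing import Optional

# Assumed mapping: only aspect ratios that correspond exactly to an OpenAI size.
_ASPECT_RATIO_TO_SIZE = {
    '1:1': '1024x1024',
    '3:2': '1536x1024',
    '2:3': '1024x1536',
}


class UserError(Exception):
    """Local stand-in for the library's user-facing error type."""


def resolve_openai_size(aspect_ratio: Optional[str], size: Optional[str]) -> Optional[str]:
    """Hypothetical helper: turn an aspect ratio into an OpenAI size option."""
    if aspect_ratio is None:
        # Nothing to map; pass the explicit size (or None) through unchanged.
        return size
    if size is not None:
        raise UserError('Specify either `size` or `aspect_ratio`, not both.')
    try:
        return _ASPECT_RATIO_TO_SIZE[aspect_ratio]
    except KeyError:
        supported = ', '.join(_ASPECT_RATIO_TO_SIZE)
        raise UserError(
            f'Aspect ratio {aspect_ratio!r} has no matching size; supported: {supported}'
        ) from None
```

With this shape, an error is raised only in the two cases the comment names: an aspect ratio with no matching size, or `size` and `aspect_ratio` given together.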
)
if tool.aspect_ratio:
    if image_config and image_config.get('aspect_ratio') != tool.aspect_ratio:
        raise UserError(
We really only support a single instance anyway, so we can drop this and just always set image_config
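A rough sketch of the simplification suggested above: with at most one tool instance there is nothing to conflict with, so the config can be set unconditionally whenever an aspect ratio is present. The two `TypedDict` classes are local stand-ins so the snippet runs without the SDK, and `apply_aspect_ratio` is a hypothetical name.

```python
from typing import Optional, TypedDict


class ImageConfigDict(TypedDict, total=False):
    """Local stand-in mirroring the SDK type quoted earlier in the thread."""

    aspect_ratio: Optional[str]


class GenerateContentConfigDict(TypedDict, total=False):
    """Local stand-in containing only the field this sketch needs."""

    image_config: ImageConfigDict


def apply_aspect_ratio(
    config: GenerateContentConfigDict, aspect_ratio: Optional[str]
) -> GenerateContentConfigDict:
    """Hypothetical helper: always overwrite image_config, with no conflict check."""
    if aspect_ratio:
        config['image_config'] = ImageConfigDict(aspect_ratio=aspect_ratio)
    return config
```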
    response_schema=response_schema,
    response_modalities=modalities,
)
config: GenerateContentConfigDict = {
Why did we have to change how this is built?
Fix for #3119