Granular Per-Agent Speech Configuration #3170

qyuo · 2025-10-14T20:55:46Z

Addresses Feature Request: #3116

This PR adds a speech_config field to the LLM agent configuration for the live use case. When an asynchronous LLM call is made to the Gemini Live API, the most specific agent configuration's speech_config is used. If that is null, the call falls back to the run configuration's speech_config. Unit tests have been added to verify this behavior.
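The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SpeechSettings` and `resolve_speech_config` are hypothetical names used only for the example, and the real change operates on the ADK agent and run configuration objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpeechSettings:
    """Hypothetical stand-in for a speech configuration object."""

    voice_name: str


def resolve_speech_config(
    agent_speech_config: Optional[SpeechSettings],
    run_speech_config: Optional[SpeechSettings],
) -> Optional[SpeechSettings]:
    """Return the agent-level config if it is set, else the run-level one."""
    if agent_speech_config is not None:
        return agent_speech_config
    # Agent did not specify anything: fall back to the run configuration,
    # which may itself be None.
    return run_speech_config
```

With this shape, an agent that leaves speech_config unset simply inherits whatever the run configuration provides.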

gemini-code-assist · 2025-10-14T20:56:01Z

Summary of Changes

Hello @qyuo, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the flexibility of LLM agent configurations by enabling granular, per-agent speech settings. It ensures that during live interactions, an agent's specific speech configuration is prioritized over the general run configuration, providing more precise control over how individual agents handle speech input and output. This change directly addresses a feature request to allow agents to define their own speech parameters.

Highlights

- Granular Speech Configuration: Introduced a new speech_config field to the LlmAgent and LlmAgentConfig classes, allowing individual agents to have their own speech settings.
- Speech Config Priority: Implemented a priority system where an agent's speech_config takes precedence over the global run_config's speech settings during asynchronous LLM calls to the Gemini Live API.
- Usage Warning: Added a warning mechanism that logs a message if an agent has a speech_config set, reminding users that this configuration is only effective in live/streaming modes with compatible models.
- Comprehensive Unit Tests: Included new unit tests to thoroughly verify the behavior of the speech_config priority, fallback logic, and the warning system.
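As a rough illustration of the usage warning in the highlights above, the check might look something like the sketch below. The function name, logger name, and message text are assumptions made for this example and are not taken from the PR's diff.

```python
import logging
from typing import Any, Optional

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("speech_config_sketch")


def warn_if_speech_config_set(agent_name: str, speech_config: Optional[Any]) -> bool:
    """Log a reminder when an agent carries a speech_config.

    Returns True if a warning was logged, which makes the behavior easy
    to assert on in a unit test.
    """
    if speech_config is None:
        return False
    logger.warning(
        "Agent %r has speech_config set; it only takes effect in "
        "live/streaming mode with a compatible model.",
        agent_name,
    )
    return True
```

Returning a boolean is a design choice for the sketch only; a test could equally capture log records instead.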

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service for Gemini Code Assist. Gemini can make mistakes, so double-check its output and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a speech_config at the LlmAgent level, providing more granular control over speech settings. The implementation correctly prioritizes the agent-specific configuration over the global run configuration. The changes are well-supported by unit tests that cover various scenarios. I have a couple of suggestions to enhance the code's robustness against potential null values and to improve the maintainability of the new tests.

src/google/adk/flows/llm_flows/basic.py

tests/unittests/flows/llm_flows/test_basic_processor.py

hangfei · 2025-10-14T22:45:09Z

If there are multiple agents with different speech config, will that work?

hangfei · 2025-10-14T22:46:24Z

Could you update this sample to have different configs for each agent: https://github.com/google/adk-python/tree/main/contributing/samples/live_bidi_streaming_multi_agent?

qyuo · 2025-10-14T23:17:35Z

> If there are multiple agents with different speech config, will that work?

Yes. Each reply uses the voice of whichever agent provides the final output, so different agents will respond with different voices.
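To make the multi-agent case concrete, here is a small sketch under stated assumptions: the agent names and voice names are placeholders invented for the example, and the lookup is a plain dictionary rather than the actual ADK agent tree.

```python
from typing import Dict, Optional

# Hypothetical run-level default voice.
RUN_VOICE = "run_default_voice"

# Hypothetical agents; None means the agent sets no speech_config.
AGENT_VOICES: Dict[str, Optional[str]] = {
    "agent_a": "voice_a",
    "agent_b": "voice_b",
    "agent_c": None,
}


def voice_for_reply(responding_agent: str) -> str:
    """Pick the voice for a reply based on the agent producing it."""
    agent_voice = AGENT_VOICES.get(responding_agent)
    # The agent-level setting wins; otherwise use the run-level default.
    return agent_voice if agent_voice is not None else RUN_VOICE
```

Here `agent_a` and `agent_b` answer in their own voices, while `agent_c` falls back to the run-level voice.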

Feature with Unit Test

8bc63b1

gemini-code-assist bot reviewed Oct 14, 2025

src/google/adk/flows/llm_flows/basic.py (outdated)

tests/unittests/flows/llm_flows/test_basic_processor.py (outdated)

Merge branch 'main' into bidi_agent_speech_config

c5deca8

Merge branch 'main' into bidi_agent_speech_config

117bdbd

Luis Pabon and others added 2 commits October 14, 2025 23:18

Added voices to sample

02d9ab3

Update run_config.py

626e86b

hangfei approved these changes Oct 15, 2025

hangfei self-assigned this Oct 15, 2025

hangfei added the "ready to pull" label ([Status] This PR is ready to be importing back to Google) Oct 15, 2025

Update run_config.py

5c1fc02

hangfei removed the "ready to pull" label Oct 15, 2025

Merge branch 'main' into bidi_agent_speech_config

187b4cc

hangfei approved these changes Oct 15, 2025

hangfei added the "ready to pull" label Oct 15, 2025

Luis Pabon added 4 commits October 15, 2025 22:35

Moved to Gemini class; run_config overrides

54cd5a2

Moved to Gemini class; run_config overrides

38a5494

Make agent config override run_config

07870c6

Added last case

2b4d3ac

hangfei removed the "ready to pull" label Oct 15, 2025

hangfei approved these changes Oct 15, 2025

hangfei added the "ready to pull" label Oct 15, 2025

Merge branch 'main' into bidi_agent_speech_config

af1bd27
