Conversation

@zgreathouse (Contributor)

Please describe the changes in your PR. If it is addressing an issue, please reference that as well.

  • Adds Hume TTS service
  • Adds Hume Python SDK (0.11.2) dependency for service

    using the Python SDK and emits `TTSAudioRawFrame`s suitable for Pipecat transports.

    Parameters
    ----------

    try:
        # Instant mode is always enabled here (not user-configurable)
        async for chunk in self._client.tts.synthesize_json_streaming(

From the TTSService class, there's a chunk_size @property.

    @property
    def chunk_size(self) -> int:
        """Get the recommended chunk size for audio streaming.

        This property indicates how much audio we download (from TTS services
        that require chunking) before we start pushing the first audio
        frame. This will make sure we download the rest of the audio while audio
        is being played without causing audio glitches (specially at the
        beginning). Of course, this will also depend on how fast the TTS service
        generates bytes.

        Returns:
            The recommended chunk size in bytes.
        """
        CHUNK_SECONDS = 0.5
        return int(self.sample_rate * CHUNK_SECONDS * 2)  # 2 bytes/sample

We've found this works well to avoid audio glitches in playback. It's helpful to use the property so we can uniformly adjust all HTTP-based services.
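
For reference, a rough sketch of how the buffering could look inside `run_tts` using `self.chunk_size`; the streaming call arguments and the chunk's `audio` field are assumptions based on the excerpt above, not the final PR code:

    buffer = b""
    async for chunk in self._client.tts.synthesize_json_streaming(...):
        # `chunk.audio` (base64-encoded PCM) is an assumption here.
        buffer += base64.b64decode(chunk.audio)
        if len(buffer) >= self.chunk_size:
            # Only push audio once we have ~0.5s buffered, per the property above.
            yield TTSAudioRawFrame(
                audio=buffer, sample_rate=self.sample_rate, num_channels=1
            )
            buffer = b""
    # Flush whatever is left once the stream ends.
    if buffer:
        yield TTSAudioRawFrame(audio=buffer, sample_rate=self.sample_rate, num_channels=1)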

@markbackman (Contributor)

Generally looks good!

Can you also add an example, following the 07-interruptible pattern? Also, add this to the evals list here:
https://github.com/pipecat-ai/pipecat/blob/main/scripts/evals/run-release-evals.py

We use the foundational examples for evals and they're also helpful discovery points for developers trying out services.

If you haven't done so already, make sure to lint your code. You can install the pre-commit hook using `uv run pre-commit install` from the base of the repo.

Last two things:

@ivaaan (Contributor) commented Oct 1, 2025

@zgreathouse I've addressed the feedback from @markbackman and created this PR: zgreathouse#1

We need to troubleshoot the example, as I see all the responses in the terminal but not in the Pipecat UI in the browser. Once that's fixed, we should be good to go (I hope).

@markbackman (Contributor) commented Oct 1, 2025

Can you please rebase the PR to resolve the conflicts?

> We need to troubleshoot the example, as I see all the responses in the terminal but not in the Pipecat UI in the browser. Once that's fixed, we should be good to go (I hope).

To get text to appear in the console, you need to add an RTVIProcessor and observer. You can see that in use in the quickstart bot file: https://github.com/pipecat-ai/pipecat/blob/main/examples/quickstart/bot.py#L84-L105

We haven't included RTVI for these examples to keep them simple, so this is probably a non-issue. I'll review the PR shortly. Thanks for the quick fixes!
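
For anyone following along, a rough sketch of what adding RTVI looks like, following the quickstart bot; the import path and surrounding processor names are assumptions based on recent Pipecat versions, not part of this PR:

    from pipecat.processors.frameworks.rtvi import RTVIConfig, RTVIObserver, RTVIProcessor

    rtvi = RTVIProcessor(config=RTVIConfig(config=[]))

    pipeline = Pipeline([
        transport.input(),
        rtvi,                            # forwards transcripts/events to the client UI
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])

    task = PipelineTask(
        pipeline,
        params=PipelineParams(enable_metrics=True),
        observers=[RTVIObserver(rtvi)],  # emits RTVI messages over the transport
    )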

pyproject.toml (outdated)

    webrtc = [ "aiortc~=1.11.0", "opencv-python~=4.11.0.86" ]
    websocket = [ "websockets>=13.1,<15.0", "fastapi>=0.115.6,<0.117.0" ]
    whisper = [ "faster-whisper~=1.1.1" ]
    fastapi = [

Can you explain this addition? I think it can be removed.

"""
return self._sample_rate

@property
Copy link
Contributor

@markbackman markbackman Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this removed intentionally? This should remain.

    audio_in_enabled=True,
    audio_out_enabled=True,
    vad_analyzer=SileroVADAnalyzer(),
    audio_out_sample_rate=HUME_SAMPLE_RATE,

@markbackman commented Oct 1, 2025

Remove. Sample rates should be set in the PipelineParams, not in individual services.

    pipeline,
    params=PipelineParams(
        enable_metrics=True,
        enable_usage_metrics=True,

@markbackman commented Oct 1, 2025

Set the sample rate here.

Suggested change:

    -    enable_usage_metrics=True,
    +    enable_usage_metrics=True,
    +    audio_out_sample_rate=HUME_SAMPLE_RATE,


This will set the sample rate to HUME_SAMPLE_RATE for all processors that output audio.
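
Putting the two suggestions together, the task setup would look roughly like this (a sketch, not the exact PR diff):

    task = PipelineTask(
        pipeline,
        params=PipelineParams(
            enable_metrics=True,
            enable_usage_metrics=True,
            audio_out_sample_rate=HUME_SAMPLE_RATE,  # applied to every processor that outputs audio
        ),
    )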

    from loguru import logger

    from pipecat.audio.vad.silero import SileroVADAnalyzer
    from pipecat.frames.frames import StartFrame

Remove. Unused.

    yield TTSStoppedFrame()


    __all__ = ["HumeTTSService"]

Remove this.

Suggested change:

    -    __all__ = ["HumeTTSService"]

The pattern is to import as:

    from pipecat.services.hume.tts import HumeTTSService

    )

    super().__init__(
        pause_frame_processing=True,

Remove. This should be: `super().__init__(sample_rate=sample_rate, **kwargs)`

Suggested change:

    -    pause_frame_processing=True,

    # Request raw PCM chunks in the streaming JSON
    pcm_fmt = FormatPcm(type="pcm")

    measuring_ttfb = True

The checks around `measuring_ttfb` aren't needed. You can remove this variable and the `if` checks.
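
For context, a sketch of the simpler flow using the base-class TTFB helpers (`start_ttfb_metrics`/`stop_ttfb_metrics`); the streaming call arguments are placeholders, not the PR's exact code:

    await self.start_ttfb_metrics()
    async for chunk in self._client.tts.synthesize_json_streaming(...):
        # The base class tracks first-byte timing itself, so no hand-rolled
        # measuring_ttfb flag is needed; repeated calls after the first report
        # are effectively no-ops.
        await self.stop_ttfb_metrics()
        # ... decode and yield TTSAudioRawFrames as above ...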


"""Hume Text-to-Speech service implementation."""

from __future__ import annotations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Suggested change:

    -    from __future__ import annotations

    pcm_bytes = base64.b64decode(audio_b64)
    self._audio_bytes += pcm_bytes

    # Send the first audio chunk immediately to avoid client-side delays.

I think you want to remove lines 208-216. This is causing duplicate initial audio to be spoken. Removing it solves the duplicate issue I was seeing when running this file verbatim.

@markbackman commented Oct 1, 2025

Remove the corresponding `first_audio_sent` variable on line 194.

        logger.exception(f"{self} error generating TTS: {e}")
        yield ErrorFrame(error=str(e))
    finally:
        # Yield any remaining audio

I think you want to remove this too (lines 230-231). Pipecat takes care of sending audio. You just need to yield TTSAudioRawFrames as you do above.
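
In other words, after removing the manual send, the tail of `run_tts` would look roughly like this (sketch):

    except Exception as e:
        logger.exception(f"{self} error generating TTS: {e}")
        yield ErrorFrame(error=str(e))
    finally:
        # No manual audio pushing here; the frames yielded above are
        # delivered by the pipeline. Just signal that TTS output ended.
        yield TTSStoppedFrame()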

    except ModuleNotFoundError as e:  # pragma: no cover - import-time guidance
        logger.error(f"Exception: {e}")
        logger.error("In order to use Hume, you need to `pip install pipecat-ai[hume]`.")
        raise

Should be:

Suggested change:

    -    raise
    +    raise Exception(f"Missing module: {e}")

    from pipecat.services.hume.tts import HUME_SAMPLE_RATE, HumeTTSService
    from pipecat.services.openai.llm import OpenAILLMService
    from pipecat.transports.base_transport import BaseTransport, TransportParams
    from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams

Update import path:

Suggested change:

    -    from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
    +    from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams

    from pipecat.services.openai.llm import OpenAILLMService
    from pipecat.transports.base_transport import BaseTransport, TransportParams
    from pipecat.transports.network.fastapi_websocket import FastAPIWebsocketParams
    from pipecat.transports.services.daily import DailyParams

Update import path:

Suggested change:

    -    from pipecat.transports.services.daily import DailyParams
    +    from pipecat.transports.daily.transport import DailyParams

        },
    ]

    context = OpenAILLMContext(messages)

Sorry about this; we just changed this pattern. To avoid a deprecation warning, use:

    context = LLMContext(messages)
    context_aggregator = LLMContextAggregatorPair(context)

Import paths are:

    from pipecat.processors.aggregators.llm_context import LLMContext
    from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
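
A short sketch of how the new context pieces slot into the example pipeline (the surrounding processor names follow the foundational-example pattern and are not part of this diff):

    context = LLMContext(messages)
    context_aggregator = LLMContextAggregatorPair(context)

    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),       # collects user turns into the context
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),  # collects assistant turns into the context
    ])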

logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([context_aggregator.user().get_context_frame()])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the LLMContext change above, you need:

Suggested change:

    -    await task.queue_frames([context_aggregator.user().get_context_frame()])
    +    await task.queue_frames([LLMRunFrame()])

`LLMRunFrame` is imported from:

    from pipecat.frames.frames import LLMRunFrame


    async def bot(runner_args: RunnerArguments):
        """Main bot entry point compatible with Pipecat Cloud."""
        runner_args.transport = "webrtc"

Remove.

You can run foundational examples using the Pipecat development runner, which takes command line args:

  • SmallWebRTCTransport: `uv run 07ad-interruptible-hume.py`
  • DailyTransport: `uv run 07ad-interruptible-hume.py --transport daily`
  • FastAPIWebsocketTransport (Twilio): `uv run 07ad-interruptible-hume.py --transport twilio --proxy YOUR_NGROK_URL`

Let's stick to the pattern in this example so that usage stays uniform; a sketch of that standard entry point follows the suggestion below.

Suggested change:

    -    runner_args.transport = "webrtc"
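
For reference, a minimal sketch of the standard entry point used by the foundational examples; the runner import paths and the `run_bot`/`transport_params` names are assumptions based on the current examples, not this PR's exact code:

    from pipecat.runner.types import RunnerArguments
    from pipecat.runner.utils import create_transport


    async def bot(runner_args: RunnerArguments):
        """Main bot entry point compatible with Pipecat Cloud."""
        # The development runner picks the transport from the command line,
        # so the function body stays transport-agnostic.
        transport = await create_transport(runner_args, transport_params)
        await run_bot(transport, runner_args)


    if __name__ == "__main__":
        from pipecat.runner.run import main

        main()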

    @@ -0,0 +1,124 @@
    #

07ad has been taken. Let's rename to 07ae.

@markbackman left a comment

LGTM! Thank you 🙌

All that remains is to rebase on the latest main and add a changelog entry. Also, make sure the code is linted. You can install the pre-commit hook (`uv run pre-commit install`) or run the `./scripts/fix-ruff.sh` script to clean up.

@codecov bot commented Oct 2, 2025

Codecov Report

❌ Patch coverage is 0% with 75 lines in your changes missing coverage. Please review.

    Files with missing lines           Patch %   Lines
    src/pipecat/services/hume/tts.py   0.00%     75 Missing ⚠️

    Files with missing lines           Coverage Δ
    src/pipecat/services/hume/tts.py   0.00% <0.00%> (ø)

@markbackman merged commit ad2adb0 into pipecat-ai:main on Oct 2, 2025
10 checks passed