@jamsea commented Sep 5, 2025

🚀 Parallel LLM Racing Implementation

This PR implements a parallel LLM racing system in 45-llm-hedge.py: two OpenAI LLM instances race to produce a response, and only the winner's frames are passed through to the TTS service.

✨ Features Added

🏁 LLMRaceProcessor

  • Custom frame processor that manages racing between two LLMs
  • Shared state coordination using class variables (_winning_llm_name, _response_started)
  • First response wins: Only the first LLM to generate LLMTextFrame is allowed through
  • Frame dropping: All subsequent frames from the losing LLM are discarded
  • Per-instance identification: Each processor knows which LLM it represents ("LLM1", "LLM2")
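
Below is a minimal sketch of the processor's core logic, assuming Pipecat's FrameProcessor/FrameDirection API and the class-variable names listed above; the actual implementation in 45-llm-hedge.py may differ in detail:

```python
from pipecat.frames.frames import Frame, LLMTextFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class LLMRaceProcessor(FrameProcessor):
    # Class-level state shared by both instances so the race is coordinated.
    _winning_llm_name = None
    _response_started = False

    def __init__(self, llm_name: str):
        super().__init__()
        self._llm_name = llm_name  # "LLM1" or "LLM2"

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        # Let the base class handle system frames such as StartFrame.
        await super().process_frame(frame, direction)

        if isinstance(frame, LLMTextFrame):
            if not LLMRaceProcessor._response_started:
                # First LLM to emit text wins the race.
                LLMRaceProcessor._response_started = True
                LLMRaceProcessor._winning_llm_name = self._llm_name
            if LLMRaceProcessor._winning_llm_name != self._llm_name:
                # Drop frames from the losing LLM.
                return

        await self.push_frame(frame, direction)
```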

🔄 ParallelPipeline Architecture

```python
parallel_llms = ParallelPipeline(
    [llm1, race_processor1],  # Branch 1: OpenAI LLM → Race Processor
    [llm2, race_processor2],  # Branch 2: OpenAI LLM → Race Processor
)
```
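
For reference, the inputs to that ParallelPipeline could be constructed roughly as follows (a sketch only; the OpenAILLMService import path and model choice are assumptions, not taken from this PR):

```python
import os

from pipecat.pipeline.parallel_pipeline import ParallelPipeline
from pipecat.services.openai.llm import OpenAILLMService

# Two independent OpenAI LLM services, each paired with its own race processor.
llm1 = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")
llm2 = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o")

race_processor1 = LLMRaceProcessor("LLM1")
race_processor2 = LLMRaceProcessor("LLM2")
```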

📊 Filtered Debug Logging

  • Built-in Pipecat DebugLogObserver with frame type filtering
  • Only logs LLM frames going to TTS using FrameEndpoint.DESTINATION
  • Clean, focused logging without noise from other pipeline components
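
The observer setup would look something like this (a sketch, assuming DebugLogObserver and FrameEndpoint live in pipecat.observers.loggers.debug_log_observer and that observers are passed to the PipelineTask; exact wiring may vary by Pipecat version):

```python
from pipecat.frames.frames import LLMTextFrame
from pipecat.observers.loggers.debug_log_observer import DebugLogObserver, FrameEndpoint
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Log LLM text frames only when they arrive at their destination (the TTS service).
debug_observer = DebugLogObserver(frame_types={LLMTextFrame: FrameEndpoint.DESTINATION})

task = PipelineTask(
    pipeline,
    params=PipelineParams(allow_interruptions=True),
    observers=[debug_observer],
)
```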

🔗 Pipeline Flow

```
transport.input() → stt → context_aggregator.user() → ParallelPipeline → tts → transport.output()
                                                          ↓
                                                    [llm1 → race1]
                                                    [llm2 → race2]
```
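
In code, the top-level assembly mirrors the flow above; a sketch (variable names are illustrative, matching the diagram rather than quoted from the file):

```python
from pipecat.pipeline.pipeline import Pipeline

pipeline = Pipeline([
    transport.input(),          # WebRTC audio in
    stt,                        # Deepgram STT
    context_aggregator.user(),  # shared context feeding both LLM branches
    parallel_llms,              # ParallelPipeline: [llm1, race1] / [llm2, race2]
    tts,                        # only the winner's frames reach TTS
    transport.output(),         # audio out
])
```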

🎯 Key Implementation Details

  1. Shared Context: Both LLMs process frames from the same context_aggregator.user() to ensure consistency
  2. Race Logic: Uses shared class variables to coordinate state between processor instances
  3. Frame Lifecycle: Proper super().process_frame() call to handle system frames like StartFrame
  4. Performance: The fastest LLM response wins; slower responses are dropped to minimize latency

🧪 Testing

  • ✅ Pipeline architecture properly created with two LLM branches
  • ✅ Component linking correctly established
  • ✅ Client connection via WebRTC transport
  • ✅ Audio processing with VAD and Deepgram STT
  • ✅ Race state management between processor instances

📝 Usage

The system automatically races two OpenAI LLM instances on every user input:

  • First LLM to respond wins the race
  • Losing LLM's frames are dropped
  • Logs show race results with 🏆 winner, ✅ continuation, and ❌ dropped frames

🔧 Technical Notes

  • Uses Pipecat's built-in ParallelPipeline for proper frame distribution
  • Custom LLMRaceProcessor handles coordination between competing LLMs
  • Maintains backward compatibility with existing pipeline structure
  • Follows Pipecat frame processing patterns and lifecycle management


codecov bot commented Sep 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
See 86 files with indirect coverage changes.


@jamsea self-assigned this Sep 5, 2025
@jamsea requested a review from markbackman September 5, 2025 09:16