Add parallel LLM racing implementation to 45-llm-hedge.py #2593
🚀 Parallel LLM Racing Implementation
This PR implements a parallel LLM racing system in `45-llm-hedge.py` that allows two OpenAI LLM instances to compete for the fastest response, with only the winner's frames passed through to the TTS service.

✨ Features Added
🏁 LLMRaceProcessor

- Tracks race state (`_winning_llm_name`, `_response_started`)
- Only the winning LLM's `LLMTextFrame` output is allowed through

🔄 ParallelPipeline Architecture
📊 Filtered Debug Logging

- `DebugLogObserver` with frame-type filtering
- Logs frames arriving at `FrameEndpoint.DESTINATION`
🔗 Pipeline Flow
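The gating behavior can be modeled in a few lines: the first branch to emit a text frame becomes the winner, and frames from the other branch are dropped before they reach TTS. This is a minimal sketch with hypothetical class names, not the actual processor:

```python
from dataclasses import dataclass

@dataclass
class TextFrame:
    source: str   # which LLM branch produced this frame
    text: str

class RaceGate:
    # Minimal model of the frame gate: the first branch to produce a
    # frame wins the race; frames from the other branch are dropped.
    def __init__(self):
        self.winner = None

    def process(self, frame):
        if self.winner is None:
            self.winner = frame.source  # first responder locks in the win
        if frame.source == self.winner:
            return frame                # pass the winner's frames downstream
        return None                     # drop the losing branch's frames

gate = RaceGate()
frames = (TextFrame("llm-2", "Hi"), TextFrame("llm-1", "Hello"), TextFrame("llm-2", " there"))
out = [f for f in frames if gate.process(f)]
```

Here `llm-2` responds first, so its two frames pass through and the frame from `llm-1` is discarded.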
🎯 Key Implementation Details

- Both LLMs read from the same `context_aggregator.user()` to ensure consistency
- A `super().process_frame()` call handles system frames like `StartFrame`
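The `super().process_frame()` point matters because the base processor class handles lifecycle/system frames, and a subclass that omits the call would silently swallow them. A simplified sketch with stand-in classes (not the real framework base class):

```python
import asyncio

class StartFrame:
    pass

class TextFrame:
    def __init__(self, text):
        self.text = text

class BaseProcessor:
    # Stand-in for the framework base: it reacts to system frames
    # such as StartFrame that subclasses must not swallow.
    def __init__(self):
        self.started = False

    async def process_frame(self, frame):
        if isinstance(frame, StartFrame):
            self.started = True

class RaceProcessor(BaseProcessor):
    def __init__(self):
        super().__init__()
        self.seen = []

    async def process_frame(self, frame):
        # Delegate to the base class first so system frames are handled.
        await super().process_frame(frame)
        if isinstance(frame, TextFrame):
            self.seen.append(frame.text)

async def main():
    p = RaceProcessor()
    await p.process_frame(StartFrame())
    await p.process_frame(TextFrame("hello"))
    return p

p = asyncio.run(main())
```

Without the `super()` call, `started` would never flip to `True` even though a `StartFrame` passed through.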
🧪 Testing
📝 Usage
The system automatically races two OpenAI LLM instances on every user input.
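The per-turn racing can be sketched with plain `asyncio`: each user input launches both requests concurrently, the first completion wins, and the loser is cancelled. The names and simulated latencies below are illustrative only:

```python
import asyncio
import random

async def query(name, prompt):
    # Stand-in for an OpenAI completion call; latency is randomized
    # to simulate one instance responding before the other.
    await asyncio.sleep(random.uniform(0.001, 0.01))
    return name, f"{name} answer to {prompt!r}"

async def handle_user_input(prompt):
    # Each user turn triggers a fresh race between the two instances.
    tasks = [asyncio.create_task(query(n, prompt)) for n in ("gpt-a", "gpt-b")]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for t in pending:
        t.cancel()  # discard the slower instance's in-flight request
    winner, answer = done.pop().result()
    return answer

answer = asyncio.run(handle_user_input("hello"))
```

Whichever instance finishes first supplies the answer; the other request is abandoned rather than awaited.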
🔧 Technical Notes

- Uses `ParallelPipeline` for proper frame distribution
- `LLMRaceProcessor` handles coordination between competing LLMs