feat: add enterprise gateway support for LLM providers #963
Conversation
Thanks @ak684, I think this could be great!
One suggestion though: since this is a somewhat unusual use case, I wonder if we'd want to separate the code out more by creating a new `class LLMWithGateway(LLM)` in `llm_with_gateway.py` and then overriding any of the necessary functions.
I'd be a little hesitant to add all of these things to our llm.py class because it'll make it a bit more difficult to follow/maintain for our normal use cases.
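For illustration, a separation along those lines might look roughly like this (a sketch with hypothetical hook names; `_transport_call` is the override point discussed later in this thread, but its exact signature here is an assumption):

```python
# Rough sketch of the suggested separation (hypothetical names; the
# actual overridable methods depend on what llm.py exposes).
from typing import Any

from openhands.sdk.llm import LLM


class LLMWithGateway(LLM):
    """LLM subclass that layers gateway auth on top of the base class."""

    def _transport_call(self, **kwargs: Any) -> Any:
        # Inject gateway auth headers, then defer to the base implementation.
        kwargs = self._inject_gateway_headers(kwargs)
        return super()._transport_call(**kwargs)

    def _inject_gateway_headers(self, kwargs: dict[str, Any]) -> dict[str, Any]:
        # Fetch/refresh the OAuth token and merge custom headers here.
        ...
        return kwargs
```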
Add LLMWithGateway subclass to support enterprise API gateways with OAuth 2.0 authentication (e.g., APIGEE, Azure API Management).

Key features:
- OAuth 2.0 token fetch and automatic refresh
- Thread-safe token caching with TTL
- Custom header injection for gateway-specific requirements
- Template variable replacement for flexible configuration
- Fully generic implementation (no vendor lock-in)

Implementation approach:
- Separate LLMWithGateway class (addresses PR #963 feedback from neubig)
- Focused feature set for OAuth + custom headers (no over-engineering)
- Comprehensive test coverage
- Complete documentation with examples

This replaces the previous approach of modifying the main LLM class, keeping the codebase cleaner and more maintainable.

Example usage (Wells Fargo APIGEE + Tachyon):

```python
llm = LLMWithGateway(
    model="gemini-1.5-flash",
    base_url=os.environ["TACHYON_API_URL"],
    gateway_auth_url=os.environ["APIGEE_TOKEN_URL"],
    gateway_auth_headers={
        "X-Client-Id": os.environ["APIGEE_CLIENT_ID"],
        "X-Client-Secret": os.environ["APIGEE_CLIENT_SECRET"],
    },
    gateway_auth_body={"grant_type": "client_credentials"},
    custom_headers={"X-Tachyon-Key": os.environ["TACHYON_API_KEY"]},
)
```

Files added:
- openhands-sdk/openhands/sdk/llm/llm_with_gateway.py (new class)
- tests/sdk/llm/test_llm_with_gateway.py (comprehensive tests)
- openhands-sdk/docs/llm_with_gateway.md (API documentation)
- examples/apigee_tachyon_example.py (working example)
Add LLMWithGateway class that extends LLM with enterprise gateway support:
- OAuth 2.0 token fetching and automatic refresh with caching
- Configurable token paths and TTL for various OAuth response formats
- Custom header injection for routing and additional authentication
- Template variable support ({{llm_model}}, {{llm_base_url}}, etc.; see the sketch after this list)
- Thread-safe token management
- Works with both completion() and responses() APIs
The class maintains full API compatibility with the base LLM class while
transparently handling gateway authentication flows behind the scenes.
Includes comprehensive test coverage (25 tests) covering:
- OAuth token lifecycle (fetch, cache, refresh)
- Header injection and custom headers
- Template replacement
- Nested path extraction from OAuth responses
- Error handling and edge cases
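For illustration, the template-variable support mentioned above could work roughly like this (a sketch with assumed names and behavior, not the shipped code):

```python
# Illustrative sketch of the template-variable replacement; names and
# behavior are assumptions, not the actual implementation.
import re


def render_template(value: str, context: dict[str, str]) -> str:
    """Replace {{name}} placeholders with values from the context."""
    # Unknown placeholders are left intact rather than raising.
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: context.get(m.group(1), m.group(0)),
        value,
    )


# e.g. render_template("Bearer {{llm_model}}", {"llm_model": "gpt-4"})
# -> "Bearer gpt-4"
```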
- Auto-detect token expiry from the OAuth expires_in field when available
- Fall back to a 300s default when expires_in is not provided
- Allow explicit TTL override via gateway_auth_token_ttl (resolution order sketched below)
- Fix method override to use _transport_call instead of completion
- Add extended thinking header merge test
- Add 3 new TTL handling tests (expires_in, fallback, override)
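A hedged sketch of that TTL resolution order (illustrative names only; the real implementation may differ):

```python
# Illustrative sketch of the TTL resolution order described above.
import time

DEFAULT_TOKEN_TTL = 300.0  # seconds; fallback when expires_in is absent


def resolve_token_expiry(
    oauth_response: dict, ttl_override: float | None = None
) -> float:
    """Return an absolute expiry timestamp for a freshly fetched token."""
    if ttl_override is not None:
        # Explicit gateway_auth_token_ttl wins over the response.
        ttl = ttl_override
    elif "expires_in" in oauth_response:
        # Auto-detect from the standard OAuth 2.0 expires_in field.
        ttl = float(oauth_response["expires_in"])
    else:
        ttl = DEFAULT_TOKEN_TTL
    return time.time() + ttl
```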
Add custom_llm_provider parameter to both litellm_completion and litellm_responses calls to support custom provider configurations.
```python
api_key=self.api_key.get_secret_value() if self.api_key else None,
base_url=self.base_url,
api_version=self.api_version,
custom_llm_provider=self.custom_llm_provider,
```
Sorry for the dumb question, but wasn't this in kwargs? I'm starting to wonder what happens to the others: are we even passing them, if we didn't pass this one?
Right now `kwargs` only contains what `select_chat_options()` returns (temperature/top_p/max_tokens/tool config, etc.). Before this change, the value lived on `self.custom_llm_provider`, but we never forwarded it when we called `litellm_completion`, so LiteLLM always saw `custom_llm_provider=None` and treated the call as a plain OpenAI request.
Thank you. 🤔 What I hear... is that none of the others, which are attributes of LLM but are not special cases in select_chat_options, are forwarded 🤔
@OpenHands Look at this discussion. Make a new branch from main, and make a test where we check whether ALL LLM attributes, regardless of whether they're treated specially in select_chat_options or not, are forwarded to litellm. You'd want to mock litellm completion, so that you see what parameters you really receive.
Note that we expect the test to probably fail, at least for the reason ak684 was fixing in this branch: custom_llm_provider. It's OK for the test to fail, as long as it tests correctly and finds a bug! Make a PR to main, with that test, so we can see it.
I'm on it! enyst can track my progress at all-hands.dev
Here’s a concise summary of what I did:
- Created a new branch from main: test/llm-attrs-forwarding
- Added a focused unit test: tests/sdk/llm/test_llm_attr_forwarding.py
  - Mocks openhands.sdk.llm.llm.litellm_completion
  - Asserts forwarding of:
    - Transport-level options: model, api_key, base_url, api_version, timeout, drop_params, seed, custom_llm_provider
    - Sampling options via select_chat_options: temperature, top_p, top_k
    - Normalization: max_output_tokens -> max_completion_tokens (non-Azure)
- The test intentionally fails because custom_llm_provider is not forwarded (surfacing the bug discussed).
- Ran pre-commit on the file; formatting, linting, and typing all passed.
- Opened a draft PR to main containing only this test:
Checklist against your request:
- New branch from main: done
- Test verifying ALL LLM attributes are forwarded: done (includes both special and non-special cases)
- Mock litellm completion to inspect actual parameters: done
- Allow/expect the test to fail to expose the bug: done
- PR to main with just the test: done (draft PR: "tests: verify LLM forwards all attributes to LiteLLM (exposes missing custom_llm_provider)" #981)
If you’d like, I can follow up with a separate PR to fix the forwarding (e.g., include custom_llm_provider in _transport_call) and expand coverage to the Responses API path as well.
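For reference, the core of such a forwarding check might look roughly like this (a sketch; the constructor and mocking details are assumptions, and the real test in tests/sdk/llm/test_llm_attr_forwarding.py is more thorough):

```python
# Rough sketch of the forwarding check; details are assumptions.
from unittest.mock import MagicMock, patch

from openhands.sdk.llm import LLM


def test_custom_llm_provider_is_forwarded():
    llm = LLM(model="gpt-4", custom_llm_provider="my-gateway")
    with patch("openhands.sdk.llm.llm.litellm_completion") as mock_call:
        mock_call.return_value = MagicMock()  # minimal stand-in response
        try:
            llm.completion(messages=[{"role": "user", "content": "hi"}])
        except Exception:
            pass  # response parsing may reject the stub; we only need the call
    kwargs = mock_call.call_args.kwargs
    assert kwargs.get("custom_llm_provider") == "my-gateway"
```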
Hey Alona! Thank you for this, interesting addition.
I'd like to know what the goal with it is. I mean, is this like, ok, we support an enterprise gateway for LLM in the base SDK, it's here, people can use it from Python, and then we forget about it for the next 6 months?
Or is there a plan to use it in app-server? In enterprise? And if so, I'd love some details, because I wonder if we'll be better served by thinking a bit how to do it.
```python
# Start with existing headers
headers: dict[str, str] = {}
existing_headers = call_kwargs.get("extra_headers")
```
This should exist already in LLM... 🤔 It's a kwarg recognized by litellm
It does exist. The intention here is to intercept whatever is in it (extended thinking, Responses API headers, etc.), render our configured custom headers, and then merge them before the call goes downstream.
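In other words, the intended merge order is roughly this (a sketch, not the exact code; `gateway_headers` stands in for the rendered custom headers):

```python
# Sketch of the merge order described above: caller-supplied
# extra_headers are preserved, and the configured gateway headers are
# layered on top before the downstream call.
headers: dict[str, str] = {}
existing_headers = call_kwargs.get("extra_headers")
if existing_headers:
    headers.update(existing_headers)
headers.update(gateway_headers)  # hypothetical: rendered custom headers
call_kwargs["extra_headers"] = headers
```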
Please correct me if I'm wrong:
- I think maybe we should expose `extra_headers` on any LLM, which is what this PR is proposing
- the rest... picking up `custom_headers` and resending as `extra_headers`... could the client code just send them as `extra_headers` directly?

You know, perhaps what would help clear this up would be to adjust one of the examples in the examples/ directory, or make a new one, using `extra_headers`, and then we'd see a little more clearly what exactly works for the desired use case (I'm not sure the render code is necessary?)
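For instance, a hypothetical example along those lines, assuming `extra_headers` were exposed on the base LLM (env var names are placeholders):

```python
# Hypothetical example of the simpler route suggested above: client
# code passes gateway headers directly, assuming the base LLM class
# exposed an extra_headers field. Env var names are placeholders.
import os

from openhands.sdk.llm import LLM

llm = LLM(
    model="gpt-4",
    base_url=os.environ["GATEWAY_BASE_URL"],
    extra_headers={
        "Authorization": f"Bearer {os.environ['GATEWAY_TOKEN']}",
        "X-Gateway-Key": os.environ["GATEWAY_API_KEY"],
    },
)
```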
I'm in meetings all day, but these are good points and I appreciate the comments! I expect to dig further into this later today or tomorrow. I've removed the rendering since I agree we can do without it for now, and I'll see whether it makes more sense to merge your PR first and then rebase this branch on it, or if another approach makes sense.
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.
@OpenHands Do a /codereview-roasted on this PR.
I'm on it! enyst can track my progress at all-hands.dev
Summary of the /codereview-roasted on PR "feat: add enterprise gateway support for LLM providers"

Verdict: Needs rework before merge.

Critical issues to fix:
Improvement opportunities:
Key insight:
- Add ssl_verify field to LLM class for certificate handling
- Forward ssl_verify and custom_llm_provider to LiteLLM calls
- Exclude extra_headers from telemetry logging for security
- Improve environment variable parsing for ssl_verify (supports false/true/cert paths; sketched below)
- Add comprehensive tests for ssl_verify and custom_llm_provider
- Add enterprise_gateway_example.py demonstrating Wells Fargo configuration

This supersedes PR #963 by merging Wells Fargo requirements with the extra_headers support from PR #733.
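A sketch of how that ssl_verify parsing could fit together (the helper names and env var below are assumptions; only a fragment of the real code is visible in the diff hunk quoted further down):

```python
# Illustrative sketch of the ssl_verify env parsing described above.
import os

TRUTHY = {"1", "true", "yes", "on"}
FALSY = {"0", "false", "no", "off"}


def _env_to_bool(raw: str) -> bool | None:
    """True/False for recognized flags; None when the value isn't boolean."""
    lowered = raw.strip().lower()
    if lowered in TRUTHY:
        return True
    if lowered in FALSY:
        return False
    return None


def ssl_verify_from_env() -> bool | str | None:
    """Supports SSL_VERIFY=false/true or a path to a CA bundle."""
    raw = os.environ.get("SSL_VERIFY")
    if raw is None or not raw.strip():
        return None  # unset: keep the default behavior
    as_bool = _env_to_bool(raw)
    return raw if as_bool is None else as_bool  # non-bool value -> cert path
```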
```python
# =========================================================================
# Transport + helpers
# =========================================================================
def _prepare_request_kwargs(self, call_kwargs: dict[str, Any]) -> dict[str, Any]:
```
```diff
-    def _prepare_request_kwargs(self, call_kwargs: dict[str, Any]) -> dict[str, Any]:
+    def prepare_request_kwargs(self, call_kwargs: dict[str, Any]) -> dict[str, Any]:
```
Nit: if it's a hook, it probably should be part of the public API.
| "messages": formatted_messages[:], # already simple dicts | ||
| "tools": tools, | ||
| "kwargs": {k: v for k, v in call_kwargs.items()}, | ||
| "kwargs": sanitized_kwargs, |
This might be better done in telemetry.py, because excluding all extra_headers means also excluding Anthropic's extended thinking from telemetry, and any other non-auth/non-authz client code use 🤔
Maybe we could do it like this, and fix it in a follow-up?
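One possible follow-up shape, purely illustrative: redact only credential-bearing header values in telemetry so that non-auth headers (e.g. extended thinking) remain visible. The names and the sensitivity heuristic here are assumptions:

```python
# Purely illustrative: mask values of headers whose names look
# credential-bearing, instead of dropping extra_headers wholesale.
SENSITIVE_MARKERS = ("auth", "key", "secret", "token", "cookie")


def sanitize_headers(headers: dict[str, str]) -> dict[str, str]:
    """Mask values of headers whose names look credential-bearing."""
    return {
        name: ("***" if any(m in name.lower() for m in SENSITIVE_MARKERS) else value)
        for name, value in headers.items()
    }
```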
```python
        return True
    if lowered in FALSY:
        return False
    return None
```
Sorry to be dense, could you perhaps explain why we needed None for bool?
It seems that in the subclass the new field ssl_verify is None, and if that is None, do we still need these changes?
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
golang:1.21-bookwormeclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22Pull (multi-arch manifest)
Run
All tags pushed for this build
The ef7deb7 tag is a multi-arch manifest (amd64/arm64); your client pulls the right arch automatically.