feat: add enterprise gateway support for LLM providers #963
Conversation
Thanks @ak684, I think this could be great!
One suggestion though: since this is a somewhat unusual use case, I wonder if we'd want to separate the code out more by creating a new `class LLMWithGateway(LLM)` in `llm_with_gateway.py` and then overriding any of the necessary functions.
I'd be a little hesitant to add all of these things to our llm.py class because it'll make it a bit more difficult to follow/maintain for our normal use cases.
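For illustration, a separation along those lines might look roughly like this (a sketch with hypothetical hook names; `_transport_call` is the override point discussed later in this thread, but its exact signature here is an assumption):

```python
# Rough sketch of the suggested separation (hypothetical names; the
# actual overridable methods depend on what llm.py exposes).
from typing import Any

from openhands.sdk.llm import LLM


class LLMWithGateway(LLM):
    """LLM subclass that layers gateway auth on top of the base class."""

    def _transport_call(self, **kwargs: Any) -> Any:
        # Inject gateway auth headers, then defer to the base implementation.
        kwargs = self._inject_gateway_headers(kwargs)
        return super()._transport_call(**kwargs)

    def _inject_gateway_headers(self, kwargs: dict[str, Any]) -> dict[str, Any]:
        # Fetch/refresh the OAuth token and merge custom headers here.
        ...
        return kwargs
```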
Add LLMWithGateway subclass to support enterprise API gateways with OAuth 2.0 authentication (e.g., APIGEE, Azure API Management).

Key features:
- OAuth 2.0 token fetch and automatic refresh
- Thread-safe token caching with TTL
- Custom header injection for gateway-specific requirements
- Template variable replacement for flexible configuration
- Fully generic implementation (no vendor lock-in)

Implementation approach:
- Separate LLMWithGateway class (addresses PR #963 feedback from neubig)
- Focused feature set for OAuth + custom headers (no over-engineering)
- Comprehensive test coverage
- Complete documentation with examples

This replaces the previous approach of modifying the main LLM class, keeping the codebase cleaner and more maintainable.

Example usage (Wells Fargo APIGEE + Tachyon):

```python
llm = LLMWithGateway(
    model="gemini-1.5-flash",
    base_url=os.environ["TACHYON_API_URL"],
    gateway_auth_url=os.environ["APIGEE_TOKEN_URL"],
    gateway_auth_headers={
        "X-Client-Id": os.environ["APIGEE_CLIENT_ID"],
        "X-Client-Secret": os.environ["APIGEE_CLIENT_SECRET"],
    },
    gateway_auth_body={"grant_type": "client_credentials"},
    custom_headers={"X-Tachyon-Key": os.environ["TACHYON_API_KEY"]},
)
```

Files added:
- openhands-sdk/openhands/sdk/llm/llm_with_gateway.py (new class)
- tests/sdk/llm/test_llm_with_gateway.py (comprehensive tests)
- openhands-sdk/docs/llm_with_gateway.md (API documentation)
- examples/apigee_tachyon_example.py (working example)
Add LLMWithGateway class that extends LLM with enterprise gateway support:
- OAuth 2.0 token fetching and automatic refresh with caching
- Configurable token paths and TTL for various OAuth response formats
- Custom header injection for routing and additional authentication
- Template variable support ({{llm_model}}, {{llm_base_url}}, etc.; see the sketch after this list)
- Thread-safe token management
- Works with both completion() and responses() APIs
The class maintains full API compatibility with the base LLM class while
transparently handling gateway authentication flows behind the scenes.
Includes comprehensive test coverage (25 tests) covering:
- OAuth token lifecycle (fetch, cache, refresh)
- Header injection and custom headers
- Template replacement
- Nested path extraction from OAuth responses
- Error handling and edge cases
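For illustration, the template-variable support mentioned above could work roughly like this (a sketch with assumed names and behavior, not the shipped code):

```python
# Illustrative sketch of the template-variable replacement; names and
# behavior are assumptions, not the actual implementation.
import re


def render_template(value: str, context: dict[str, str]) -> str:
    """Replace {{name}} placeholders with values from the context."""
    # Unknown placeholders are left intact rather than raising.
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: context.get(m.group(1), m.group(0)),
        value,
    )


# e.g. render_template("Bearer {{llm_model}}", {"llm_model": "gpt-4"})
# -> "Bearer gpt-4"
```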
- Auto-detect token expiry from the OAuth expires_in field when available
- Fall back to a 300s default when expires_in is not provided
- Allow explicit TTL override via gateway_auth_token_ttl (resolution order sketched below)
- Fix method override to use _transport_call instead of completion
- Add extended thinking header merge test
- Add 3 new TTL handling tests (expires_in, fallback, override)
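A hedged sketch of that TTL resolution order (illustrative names only; the real implementation may differ):

```python
# Illustrative sketch of the TTL resolution order described above.
import time

DEFAULT_TOKEN_TTL = 300.0  # seconds; fallback when expires_in is absent


def resolve_token_expiry(
    oauth_response: dict, ttl_override: float | None = None
) -> float:
    """Return an absolute expiry timestamp for a freshly fetched token."""
    if ttl_override is not None:
        # Explicit gateway_auth_token_ttl wins over the response.
        ttl = ttl_override
    elif "expires_in" in oauth_response:
        # Auto-detect from the standard OAuth 2.0 expires_in field.
        ttl = float(oauth_response["expires_in"])
    else:
        ttl = DEFAULT_TOKEN_TTL
    return time.time() + ttl
```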
Add custom_llm_provider parameter to both litellm_completion and litellm_responses calls to support custom provider configurations.
```python
api_key=self.api_key.get_secret_value() if self.api_key else None,
base_url=self.base_url,
api_version=self.api_version,
custom_llm_provider=self.custom_llm_provider,
```
Sorry for the dumb question, but wasn't this in kwargs? I'm starting to wonder what happens to the others: are we even passing them, if we didn't pass this one?
Right now `kwargs` only contains what `select_chat_options()` returns (temperature/top_p/max_tokens/tool config, etc.). Before this change, the value lived on `self.custom_llm_provider`, but we never forwarded it when we called `litellm_completion`, so LiteLLM always saw `custom_llm_provider=None` and treated the call as a plain OpenAI request.
Thank you. 🤔 What I hear... is that none of the others, which are attributes of LLM but are not special cases in select_chat_options, are forwarded 🤔
@OpenHands Look at this discussion. Make a new branch from main, and make a test where we check whether ALL LLM attributes, regardless of whether they're treated specially in select_chat_options or not, are forwarded to litellm. You'd want to mock litellm completion, so that you see what parameters you really receive.
Note that we expect the test to probably fail, at least for the reason ak684 was fixing in this branch: custom_llm_provider. It's OK for the test to fail, as long as it tests correctly and finds a bug! Make a PR to main, with that test, so we can see it.
I'm on it! enyst can track my progress at all-hands.dev
Here’s a concise summary of what I did:
- Created a new branch from main: test/llm-attrs-forwarding
- Added a focused unit test: tests/sdk/llm/test_llm_attr_forwarding.py
  - Mocks openhands.sdk.llm.llm.litellm_completion
  - Asserts forwarding of:
    - Transport-level options: model, api_key, base_url, api_version, timeout, drop_params, seed, custom_llm_provider
    - Sampling options via select_chat_options: temperature, top_p, top_k
    - Normalization: max_output_tokens -> max_completion_tokens (non-Azure)
- The test intentionally fails because custom_llm_provider is not forwarded (surfacing the bug discussed).
- Ran pre-commit on the file; formatting, linting, and typing all passed.
- Opened a draft PR to main containing only this test:
Checklist against your request:
- New branch from main: done
- Test verifying ALL LLM attributes are forwarded: done (includes both special and non-special cases)
- Mock litellm completion to inspect actual parameters: done
- Allow/expect the test to fail to expose the bug: done
- PR to main with just the test: done (draft PR: "tests: verify LLM forwards all attributes to LiteLLM (exposes missing custom_llm_provider)" #981)
If you’d like, I can follow up with a separate PR to fix the forwarding (e.g., include custom_llm_provider in _transport_call) and expand coverage to the Responses API path as well.
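For reference, the core of such a forwarding check might look roughly like this (a sketch; the constructor and mocking details are assumptions, and the real test in tests/sdk/llm/test_llm_attr_forwarding.py is more thorough):

```python
# Rough sketch of the forwarding check; details are assumptions.
from unittest.mock import MagicMock, patch

from openhands.sdk.llm import LLM


def test_custom_llm_provider_is_forwarded():
    llm = LLM(model="gpt-4", custom_llm_provider="my-gateway")
    with patch("openhands.sdk.llm.llm.litellm_completion") as mock_call:
        mock_call.return_value = MagicMock()  # minimal stand-in response
        try:
            llm.completion(messages=[{"role": "user", "content": "hi"}])
        except Exception:
            pass  # response parsing may reject the stub; we only need the call
    kwargs = mock_call.call_args.kwargs
    assert kwargs.get("custom_llm_provider") == "my-gateway"
```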
Hey Alona! Thank you for this, interesting addition.
I'd like to know what the goal with it is. I mean, is this like, ok, we support an enterprise gateway for LLM in the base SDK, it's here, people can use it from Python, and then we forget about it for the next 6 months?
Or is there a plan to use it in app-server? In enterprise? And if so, I'd love some details, because I wonder if we'll be better served by thinking a bit how to do it.
```python
# Start with existing headers
headers: dict[str, str] = {}
existing_headers = call_kwargs.get("extra_headers")
```
This should exist already in LLM... 🤔 It's a kwarg recognized by litellm
It does exist. The intention here is to intercept whatever is in it (extended thinking, Responses API headers, etc.), render our configured custom headers, and then merge them before the call goes downstream.
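In other words, the intended merge order is roughly this (a sketch, not the exact code; `gateway_headers` stands in for the rendered custom headers):

```python
# Sketch of the merge order described above: caller-supplied
# extra_headers are preserved, and the configured gateway headers are
# layered on top before the downstream call.
headers: dict[str, str] = {}
existing_headers = call_kwargs.get("extra_headers")
if existing_headers:
    headers.update(existing_headers)
headers.update(gateway_headers)  # hypothetical: rendered custom headers
call_kwargs["extra_headers"] = headers
```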
Please correct me if I'm wrong:
- I think maybe we should expose `extra_headers` on any LLM, which is what this PR is proposing
- the rest... picking up `custom_headers` and resending as `extra_headers`... could the client code just send them as `extra_headers` directly?

You know, perhaps what would help clear this up would be to adjust one of the examples in the examples/ directory, or make a new one, using `extra_headers`, and then we'd see a little more clearly what exactly works for the desired use case (I'm not sure the render code is necessary?)
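For instance, a hypothetical example along those lines, assuming `extra_headers` were exposed on the base LLM (env var names are placeholders):

```python
# Hypothetical example of the simpler route suggested above: client
# code passes gateway headers directly, assuming the base LLM class
# exposed an extra_headers field. Env var names are placeholders.
import os

from openhands.sdk.llm import LLM

llm = LLM(
    model="gpt-4",
    base_url=os.environ["GATEWAY_BASE_URL"],
    extra_headers={
        "Authorization": f"Bearer {os.environ['GATEWAY_TOKEN']}",
        "X-Gateway-Key": os.environ["GATEWAY_API_KEY"],
    },
)
```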
I'm in meetings all day, but these are good points and I appreciate the comments! I expect to dig further into this later today or tomorrow. I've removed the rendering since I agree we can do without it for now, and I'll see whether it makes more sense to merge your PR first and then rebase this branch on it, or if another approach makes sense.
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.
@OpenHands Do a /codereview-roasted on this PR.
I'm on it! enyst can track my progress at all-hands.dev
Summary of the /codereview-roasted on PR "feat: add enterprise gateway support for LLM providers"

Verdict: Needs rework before merge.

Critical issues to fix:
Improvement opportunities:
Key insight:
- Add ssl_verify field to LLM class for certificate handling
- Forward ssl_verify and custom_llm_provider to LiteLLM calls
- Exclude extra_headers from telemetry logging for security
- Improve environment variable parsing for ssl_verify (supports false/true/cert paths; sketched below)
- Add comprehensive tests for ssl_verify and custom_llm_provider
- Add enterprise_gateway_example.py demonstrating Wells Fargo configuration

This supersedes PR #963 by merging Wells Fargo requirements with the extra_headers support from PR #733.
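A sketch of how that ssl_verify parsing could fit together (the helper names and env var below are assumptions; only a fragment of the real code is visible in the diff hunk quoted further down):

```python
# Illustrative sketch of the ssl_verify env parsing described above.
import os

TRUTHY = {"1", "true", "yes", "on"}
FALSY = {"0", "false", "no", "off"}


def _env_to_bool(raw: str) -> bool | None:
    """True/False for recognized flags; None when the value isn't boolean."""
    lowered = raw.strip().lower()
    if lowered in TRUTHY:
        return True
    if lowered in FALSY:
        return False
    return None


def ssl_verify_from_env() -> bool | str | None:
    """Supports SSL_VERIFY=false/true or a path to a CA bundle."""
    raw = os.environ.get("SSL_VERIFY")
    if raw is None or not raw.strip():
        return None  # unset: keep the default behavior
    as_bool = _env_to_bool(raw)
    return raw if as_bool is None else as_bool  # non-bool value -> cert path
```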
```python
# =========================================================================
# Transport + helpers
# =========================================================================
def _prepare_request_kwargs(self, call_kwargs: dict[str, Any]) -> dict[str, Any]:
```
```diff
-    def _prepare_request_kwargs(self, call_kwargs: dict[str, Any]) -> dict[str, Any]:
+    def prepare_request_kwargs(self, call_kwargs: dict[str, Any]) -> dict[str, Any]:
```
Nit: if it's a hook, it probably should be part of the public API.
| "messages": formatted_messages[:], # already simple dicts | ||
| "tools": tools, | ||
| "kwargs": {k: v for k, v in call_kwargs.items()}, | ||
| "kwargs": sanitized_kwargs, |
This might be better done in telemetry.py, because excluding all extra_headers means also excluding Anthropic's extended thinking from telemetry, and any other non-auth/non-authz client code use 🤔
Maybe we could do it like this, and fix it in a follow-up?
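One possible follow-up shape, purely illustrative: redact only credential-bearing header values in telemetry so that non-auth headers (e.g. extended thinking) remain visible. The names and the sensitivity heuristic here are assumptions:

```python
# Purely illustrative: mask values of headers whose names look
# credential-bearing, instead of dropping extra_headers wholesale.
SENSITIVE_MARKERS = ("auth", "key", "secret", "token", "cookie")


def sanitize_headers(headers: dict[str, str]) -> dict[str, str]:
    """Mask values of headers whose names look credential-bearing."""
    return {
        name: ("***" if any(m in name.lower() for m in SENSITIVE_MARKERS) else value)
        for name, value in headers.items()
    }
```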
```python
        return True
    if lowered in FALSY:
        return False
    return None
```
Sorry to be dense, could you perhaps explain why we needed None for bool?
It seems that in the subclass the new field ssl_verify is None, and if that is None, do we still need these changes?
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
golang:1.21-bookwormeclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22Pull (multi-arch manifest)
Run
All tags pushed for this build
The ef7deb7 tag is a multi-arch manifest (amd64/arm64); your client pulls the right arch automatically.