contenox/runtime is an open-source runtime for orchestrating generative AI workflows. It treats AI workflows as state machines, enabling:
✅ Declarative workflow definition ✅ Built-in state management ✅ Vendor-agnostic execution ✅ Multi-backend orchestration ✅ First-class observability ✅ Written in Go for high-load workloads ✅ Agentic capabilities via hooks ✅ Drop-in replacement for the OpenAI chat-completions API
The following commands start all necessary services, configure the backend, and download the initial models.
- Docker and Docker Compose
- curl
- jq
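A quick sanity check before proceeding (a small convenience sketch; the tool names match the prerequisites above):

```shell
# Verify that the required tools are available on PATH.
for tool in docker curl jq; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: ok"
  else
    echo "$tool: MISSING"
  fi
done
# Docker Compose v2 ships as a docker subcommand.
docker compose version >/dev/null 2>&1 && echo "docker compose: ok" || echo "docker compose: MISSING"
```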
# Clone the repository
git clone https://github.com/contenox/runtime.git
cd runtime
# Configure the system's fallback models
export EMBED_MODEL=nomic-embed-text:latest
export EMBED_PROVIDER=ollama
export EMBED_MODEL_CONTEXT_LENGTH=2048
export TASK_MODEL=phi3:3.8b
export TASK_MODEL_CONTEXT_LENGTH=2048
export TASK_PROVIDER=ollama
export CHAT_MODEL=phi3:3.8b
export CHAT_MODEL_CONTEXT_LENGTH=2048
export CHAT_PROVIDER=ollama
export OLLAMA_BACKEND_URL="http://ollama:11434"
# or point to a host instance, e.g.: export OLLAMA_BACKEND_URL="http://host.docker.internal:11434"
# When using host.docker.internal, make sure Ollama listens on the Docker bridge:
# sudo systemctl edit ollama.service -> Environment="OLLAMA_HOST=172.17.0.1" (or 0.0.0.0)
# Start the container services
echo "Starting services with 'docker compose up -d'..."
docker compose up -d
echo "Services are starting up."
# Configure the runtime with your model preferences
# the bootstrapping script works only with Ollama models/backends
# to use other providers, refer to the API spec
./scripts/bootstrap.sh $EMBED_MODEL $TASK_MODEL $CHAT_MODEL
# setup a demo OpenAI chat-completion and model endpoint
./scripts/openai-demo.sh $CHAT_MODEL demo
# this will setup the following endpoints:
# - http://localhost:8081/openai/demo/v1/chat/completions
# - http://localhost:8081/openai/demo/v1/models
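With the demo endpoint in place, any OpenAI-compatible client can talk to it. A minimal curl sketch (assumes the stack above is running; the model name must match the CHAT_MODEL registered during bootstrap, and the request body follows the standard OpenAI chat-completions schema):

```shell
# Write a standard OpenAI-style chat-completions request body.
cat > /tmp/openai-req.json <<'EOF'
{
  "model": "phi3:3.8b",
  "messages": [
    {"role": "user", "content": "Say hello in one short sentence."}
  ]
}
EOF
# Send it to the demo endpoint created by openai-demo.sh.
curl -s -X POST http://localhost:8081/openai/demo/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @/tmp/openai-req.json || echo "request failed (is the stack running?)"
```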
Once the script finishes, the environment is fully configured and ready to use.
After the bootstrap is complete, test the setup by executing a simple prompt:
curl -X POST http://localhost:8081/execute \
-H "Content-Type: application/json" \
-d '{"prompt": "Explain quantum computing in simple terms"}'
Save the following as qa.json:
{
"input": "What's the best way to optimize database queries?",
"inputType": "string",
"chain": {
"id": "smart-query-assistant",
"description": "Handles technical questions",
"tasks": [
{
"id": "generate_response",
"description": "Generate final answer",
"handler": "raw_string",
"systemInstruction": "You're a senior engineer. Provide concise, professional answers to technical questions.",
"transition": {
"branches": [
{ "operator": "default", "goto": "end" }
]
}
}
]
}
}
Execute the workflow:
curl -X POST http://localhost:8081/tasks \
-H "Content-Type: application/json" \
-d @qa.json
All runtime activity is captured in structured logs:
docker logs contenox-runtime-kernel
- Conditional Branching: Route execution based on LLM outputs
- Built-in Handlers:
  - condition_key: Validate and route responses
  - parse_number: Extract numerical values
  - parse_range: Handle score ranges
  - raw_string: Standard text generation
  - embedding: Embedding generation
  - model_execution: Model execution on a chat history
  - hook: Calls a user-defined hook pointing to an external service
- Context Preservation: Automatic input/output passing between steps
- Multi-Model Support: Define preferred models for each task chain
- Retry and Timeout: Configure task-level retries and timeouts for robust workflows
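Conditional branching can be sketched by combining a condition_key task with non-default branches. The sketch below reuses the chain schema from qa.json; the equals operator and its when field are assumptions for illustration — check the API spec for the exact branch operators:

```json
{
  "input": "Is this message spam? Answer yes or no: 'You won a free prize!'",
  "inputType": "string",
  "chain": {
    "id": "spam-filter",
    "description": "Classify, then branch on the answer",
    "tasks": [
      {
        "id": "classify",
        "description": "Yes/no classification",
        "handler": "condition_key",
        "systemInstruction": "Answer strictly with yes or no.",
        "transition": {
          "branches": [
            { "operator": "equals", "when": "yes", "goto": "flag_spam" },
            { "operator": "default", "goto": "end" }
          ]
        }
      },
      {
        "id": "flag_spam",
        "description": "Produce a short explanation",
        "handler": "raw_string",
        "systemInstruction": "Explain in one sentence why the message looks like spam.",
        "transition": {
          "branches": [
            { "operator": "default", "goto": "end" }
          ]
        }
      }
    ]
  }
}
```

Such a chain is executed the same way as qa.json, via a POST to the /tasks endpoint.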
Define preferred model provider and backend resolution policy directly within task chains. This allows for seamless, dynamic orchestration across various LLM providers.
graph TD
subgraph "User Space"
U[User / Client Application]
end
subgraph "contenox/runtime"
API[API Layer]
OE["Orchestration Engine <br/> Task Execution <br/> & State Management"]
CONN["Connectors <br/> Model Resolver <br/> & Hook Client"]
end
subgraph "External Services"
LLM[LLM Backends <br/> Ollama, OpenAI, vLLM, etc.]
HOOK[External Tools and APIs <br/> Custom Hooks]
end
%% --- Data Flow ---
U -- API Requests --> API
API -- Triggers Task Chain --> OE
OE -- Executes via --> CONN
CONN -- Routes to LLMs --> LLM
CONN -- Calls External Hooks --> HOOK
LLM -- LLM Responses --> CONN
HOOK -- Hook Responses --> CONN
CONN -- Results --> OE
OE -- Returns Final Output --> API
API -- API Responses --> U
- Unified Interface: Consistent API across providers
- Automatic Sync: Models stay consistent across backends
- Affinity Group Management: Map models to backends for performance tiering and routing strategies
- Backend Resolver: Distribute requests to backends based on resolution policies
Hooks are external servers that can be called from within task chains once registered. They allow task chains to interact with systems and data outside the runtime. 🔗 See Hook Documentation
The full API surface is documented in OpenAPI format, making it easy to integrate with other tools. You can find more details here:
The API tests are available for additional context.