
Conversation

@d4mr (Member) commented Oct 3, 2025

Summary by CodeRabbit

  • New Features

    • Separate runtimes for HTTP serving and background execution.
    • Coordinated startup and shutdown across services with timeouts.
    • Global metrics registry and executor metrics initialization.
  • Refactor

    • Reworked startup sequencing: build resources, start server, then workers.
    • Delegated asynchronous tasks to dedicated runtimes with tuned thread ratios.
  • Chores

    • Enhanced log messages to reflect multi-stage startup/shutdown lifecycle.


coderabbitai bot commented Oct 3, 2025

Walkthrough

Refactors server startup to a synchronous main that orchestrates two Tokio runtimes (HTTP and executor). Moves initialization and workers to the executor runtime, starts HTTP server on a separate runtime, sets up global metrics, sequences startup, and adds coordinated shutdown across both runtimes with timeouts and revised logging.

Changes

All changes are in server/src/main.rs:

  • Runtime orchestration and separation: Replaces the async main with a sync entrypoint that creates two Tokio runtimes (http_runtime, executor_runtime) with a 1:2 thread ratio, delegating the HTTP server to http_runtime and initialization/queues to executor_runtime.
  • Startup sequencing: Builds resources (Vault, IAW, KMS cache, signers, Redis, auth cache, queue manager) in executor_runtime, then initializes and starts the HTTP server in http_runtime, then starts workers, and awaits Ctrl-C in executor_runtime.
  • Coordinated shutdown: Implements ordered shutdown: stop the HTTP server, stop the workers, then gracefully shut down both runtimes with timeouts.
  • Metrics initialization: Initializes the metric registry and executor metrics globally; wires metrics into the runtimes.
  • Logging updates: Adjusts log messages to reflect the multi-runtime lifecycle and step-by-step startup/shutdown.
  • Public signature change: Changes the entrypoint from async fn main() -> anyhow::Result<()> to fn main() -> anyhow::Result<()>.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant M as Main (sync)
    participant ER as Executor Runtime
    participant HR as HTTP Runtime
    participant RES as Resources (Vault/IAW/KMS/Signers/Redis/Auth/Queues)
    participant HTTP as HTTP Server
    participant W as Queue Workers
    participant OS as OS Signal (Ctrl-C)

    rect rgba(230,240,255,0.6)
    note over M: Startup
    M->>M: Create ER (threads: 2x HTTP)
    M->>M: Create HR (threads: 1x)
    M->>ER: Spawn init task
    ER->>RES: Build clients, caches, signers, queue mgr
    ER-->>M: Resources ready
    M->>HR: Spawn HTTP init/start
    HR->>HTTP: Build and start server
    HR-->>M: HTTP ready
    M->>ER: Start workers
    ER->>W: Launch workers
    ER->>OS: Await Ctrl-C
    end

    rect rgba(255,235,230,0.6)
    note over M: Coordinated shutdown
    OS-->>ER: Ctrl-C received
    ER-->>M: Signal shutdown
    M->>HR: Stop HTTP server (graceful, timeout)
    HR->>HTTP: Shutdown
    M->>ER: Stop workers (graceful)
    ER->>W: Shutdown
    M->>HR: Shutdown runtime (timeout)
    M->>ER: Shutdown runtime (timeout)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title succinctly describes the core change of introducing distinct runtimes for HTTP and worker tasks, directly reflecting the main modifications in the PR.
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch pb/seperate-runtimes

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
server/src/main.rs (4)

37-42: Thread split can underutilize cores; compute executor threads as the leftover.

The current math underutilizes cores when cores % 3 != 0 and oversubscribes a single-core machine (two threads on one core). Prefer allocating HTTP threads first, then assigning the remainder to executors.

Apply:

-// Allocate 1/3 for HTTP, 2/3 for executors (maintaining 1:2 ratio), with minimum of 1 each
-let http_threads = (available_parallelism / 3).max(1);
-let executor_threads = (available_parallelism * 2 / 3).max(1);
+// Allocate ~1/3 for HTTP, assign the remainder to executors; ensure at least 1 each
+let http_threads = (available_parallelism / 3).max(1);
+let executor_threads = (available_parallelism.saturating_sub(http_threads)).max(1);

Optional: if available_parallelism == 1, consider a single runtime to avoid oversubscription.

Also applies to: 49-53


138-147: Global metrics init is fine; consider OnceCell to enforce single-init.

initialize_metrics overwrites a global RwLock each run. Prefer OnceCell to prevent accidental reinit in tests/multiple binaries.


167-187: Bind HTTP socket before starting workers to fail fast on port conflicts.

Currently, workers start before bind. If bind fails, you started background work unnecessarily. Bind first, then start workers.


197-203: Also handle SIGTERM for graceful shutdown (k8s/systemd).

Add SIGTERM alongside Ctrl+C with cfg(unix) guard.

-    executor_runtime.block_on(async {
-        if let Err(e) = tokio::signal::ctrl_c().await {
-            tracing::warn!("Failed to listen for Ctrl+C: {}", e);
-        }
-        tracing::info!("Shutdown signal received");
-    });
+    executor_runtime.block_on(async {
+        #[cfg(unix)]
+        {
+            let mut sigterm = tokio::signal::unix::signal(
+                tokio::signal::unix::SignalKind::terminate()
+            ).expect("Failed to install SIGTERM handler");
+            tokio::select! {
+                res = tokio::signal::ctrl_c() => {
+                    if let Err(e) = res {
+                        tracing::warn!("Failed to listen for Ctrl+C: {}", e);
+                    }
+                }
+                _ = sigterm.recv() => {}
+            }
+        }
+        #[cfg(not(unix))]
+        {
+            if let Err(e) = tokio::signal::ctrl_c().await {
+                tracing::warn!("Failed to listen for Ctrl+C: {}", e);
+            }
+        }
+        tracing::info!("Shutdown signal received");
+    });
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3774a4b and aa22897.

📒 Files selected for processing (1)
  • server/src/main.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/main.rs (5)
server/src/http/server.rs (1)
  • new (47-109)
server/src/queue/manager.rs (1)
  • new (47-273)
core/src/signer.rs (1)
  • new (211-216)
thirdweb-core/src/iaw/mod.rs (1)
  • new (185-201)
executors/src/metrics.rs (1)
  • initialize_metrics (101-104)
🔇 Additional comments (5)
server/src/main.rs (5)

161-166: Workers startup LGTM.

Starting workers on the executor runtime and holding a shutdown handle is correct.


226-231: Runtime shutdown with timeouts: LGTM.

The order is sound: stop HTTP first so no new requests arrive, then workers, then the runtimes, each with bounded timeouts.


18-24: Log target "twmq" is valid
The crate twmq exists in the workspace, so the twmq=info filter will apply as intended.


15-31: Tokio feature flags confirmed: All relevant crates use features = ["full"], covering rt-multi-thread, signal, net, and time.


189-194: EngineServer::start is non-blocking: it uses tokio::spawn to run the serve future and returns immediately, so http_runtime.block_on won’t hang.

Comment on lines +96 to +103
let authorization_cache = EoaAuthorizationCache::new(
moka::future::Cache::builder()
.max_capacity(1024 * 1024 * 1024)
.time_to_live(Duration::from_secs(60 * 5))
.time_to_idle(Duration::from_secs(60))
.build(),
);


⚠️ Potential issue | 🟠 Major

Authorization cache capacity set to 1,073,741,824 entries is unsafe (OOM risk).

moka::future::Cache max_capacity counts entries (not bytes). 1B entries will exhaust memory long before TTL/TTI evict.

Use a realistic entry cap or move to weight-based sizing:

-        let authorization_cache = EoaAuthorizationCache::new(
-            moka::future::Cache::builder()
-                .max_capacity(1024 * 1024 * 1024)
-                .time_to_live(Duration::from_secs(60 * 5))
-                .time_to_idle(Duration::from_secs(60))
-                .build(),
-        );
+        let authorization_cache = EoaAuthorizationCache::new(
+            moka::future::Cache::builder()
+                // Cap entries to a sane upper bound; tune via config if needed
+                .max_capacity(100_000)
+                .time_to_live(Duration::from_secs(5 * 60))
+                .time_to_idle(Duration::from_secs(60))
+                .build(),
+        );

If you intended capacity by size, switch to a weigher and weight-based policy (I can draft that).

🤖 Prompt for AI Agents
In server/src/main.rs around lines 96 to 103, the cache is created with
max_capacity(1024 * 1024 * 1024) which is interpreted as 1,073,741,824 entries
(unsafe and will OOM). Replace that hardcoded huge entry count with a realistic
entry limit (e.g., a few thousand or a configurable constant) OR switch to
weight-based sizing by adding a weigher that returns an approximate byte weight
per entry and using the cache builder's weight-based capacity API; keep the
TTL/TTI settings as-is and ensure the chosen limit or total-weight cap is
configurable via env/config so it can be tuned without code changes.
