Skip to content

Conversation

djeebus
Copy link
Contributor

@djeebus djeebus commented Oct 16, 2025

This allows us to test and standardize how we start, monitor, cancel, and clean up our goroutines. This PR only adds it to orchestrator, but we should add this to api, client-proxy, and the rest as well.


Note

Introduces a shared supervisor for task lifecycle and refactors orchestrator to use it for starting, monitoring, and cleanup; removes legacy goroutine/closer logic and tweaks network pool Close.

  • Supervisor (new package packages/shared/pkg/supervisor):
    • Adds Runner with AddTask, Run, and Close for task lifecycle management.
    • Handles OS signals, cancels tasks via context, aggregates exit/errors, and runs cleanups in reverse order.
    • Includes TaskExitedError and tests covering cancellation and early task exit.
  • Orchestrator refactor (packages/orchestrator/main.go):
    • Replaces custom errgroup/signal/closers/startService with supervisor.Runner tasks for all services (sandbox proxy, device pool, network pool, hyperloop server, cmux, http, grpc, etc.).
    • Adds cleanupLogger helper and registers loggers/clients/batchers/observers as Cleanup tasks.
    • Adjusts shutdown: runner.Run(ctx) then runner.Close(ctx); removes serviceDoneError, closer type, and related logic.
  • Network pool (internal/sandbox/network/pool.go):
    • Removes info log on Close; functional behavior unchanged.

Written by Cursor Bugbot for commit 8ecda4d. This will update automatically on new commits. Configure here.

cursor[bot]

This comment was marked as outdated.

closed bool
doneCh chan taskResult
wg sync.WaitGroup
cancelFunc context.CancelFunc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of having this context on the struct. Also, the cancelation is incorrect and differs from the initial implementation drastically

func (s *Runner) Close(ctx context.Context) error {
// Idempotent: if Run already performed cleanup, nothing to do
s.mu.Lock()
if s.cleanupsRun {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be done using atomic boolean

mu sync.Mutex
started bool
closed bool
doneCh chan taskResult
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about using the setonce instead? It could then handle multiple variables on this struct

tasks []Task
logger *zap.Logger

mu sync.Mutex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we even need the mutex?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants