Conversation

AayushSaini101
Contributor

Closes #93

Collaborator

@af-md af-md left a comment

Can you push again with the deleted .idea folder?

@AayushSaini101
Contributor Author

Can you push again with the deleted .idea folder?

Done, thanks :)

@AayushSaini101 AayushSaini101 requested a review from af-md September 1, 2025 11:24
Collaborator

@af-md af-md left a comment

@AayushSaini101 We also need to cover retries on DB interruptions in steps and in other DBOS operations.

@maxdml
Collaborator

maxdml commented Sep 1, 2025

@AayushSaini101 Thanks for the contribution! As @af-md mentioned, what #93 aims to fix is resilience to flaky DB connections when performing DBOS durability features. Here's an example:

  • A workflow runs and succeeds.
  • The code in workflow.go, responsible for recording the output of the workflow, fails due to a connection error in updateWorkflowOutcome.

What we need is to detect that the error falls into the category of retriable DB errors and then retry the operation.

To do this, we'd first check whether the error is of type pgconn.PgError and whether its code is a connection error code. Here's a good list:

"08006"  // connection_failure - Connection lost during operation
"08003"  // connection_does_not_exist - Stale/terminated connection
"57P01"  // admin_shutdown - Server is shutting down
"57P03"  // cannot_connect_now - Server is starting up
"08001"  // sqlclient_unable_to_establish_sqlconnection - Can't connect
"08004"  // sqlserver_rejected_establishment_of_sqlconnection - Rejected (too many connections)

For this enhancement, it would be worth exploring this library: https://github.com/avast/retry-go. Specifically, if we could wrap the retry logic inside system_database.go, that'd be great.

We'll also need a good amount of testing, which we can do using https://github.com/Shopify/toxiproxy

Successfully merging this pull request may close these issues.

System DB retries