@bonnefoa bonnefoa commented Nov 6, 2025

What does this PR do?

Add sent/write/flush/replay LSN delay metrics from pg_stat_replication.

Motivation

pg_stat_replication exposes the last sent, written, flushed, and replayed WAL locations for each standby server.

The sent delay doesn't depend on feedback messages from the standby, since the primary tracks the WAL it has sent over the connection. It can be used to gauge how quickly a standby is catching up and how far behind it still is.
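
As a rough illustration of what such a delay measures, the sketch below converts textual LSNs to byte positions and subtracts each position reported for a standby from the primary's current WAL position. The function names, the output key names, and the sample values are hypothetical and not taken from this PR; only the column names (sent_lsn, write_lsn, flush_lsn, replay_lsn) come from pg_stat_replication itself.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit position printed as two 32-bit hexadecimal halves.
    return (int(high, 16) << 32) | int(low, 16)


def lsn_delays(current_lsn: str, row: dict) -> dict:
    """Return the byte distance between the primary's LSN and each standby LSN.

    `row` stands for one pg_stat_replication row; NULL columns (None) are skipped.
    """
    current = lsn_to_int(current_lsn)
    delays = {}
    for column in ("sent_lsn", "write_lsn", "flush_lsn", "replay_lsn"):
        value = row.get(column)
        if value is not None:
            # Hypothetical key naming, used only for this illustration.
            delays[f"{column}_delay"] = current - lsn_to_int(value)
    return delays
```

In PostgreSQL itself the same subtraction is available through pg_wal_lsn_diff(); the Python version is only meant to make the arithmetic explicit.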

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

queries.append(QUERY_PG_REPLICATION_SLOTS)
queries.append(QUERY_PG_REPLICATION_STATS_METRICS)

P1: Guard replication stats query on Aurora without logical WAL

The new QUERY_PG_REPLICATION_STATS_METRICS query now calls pg_current_wal_lsn() for every row, but it is still appended unconditionally for all Postgres ≥10 environments. On Aurora instances where wal_level is not set to logical, calling pg_current_wal_lsn() raises an error (this is why the control checkpoint metrics are skipped under the same condition a few lines above). Without a similar guard here, the check will start failing on default Aurora setups. Consider skipping this query when self.is_aurora and self.wal_level != 'logical' or using a function that is allowed on Aurora.
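
A minimal sketch of the guard suggested in this comment, assuming the check exposes is_aurora and wal_level as described. The query constants here are placeholders for the real definitions, and select_replication_queries is a hypothetical helper, not a function from the integration.

```python
# Placeholders standing in for the real query definitions, which are not shown here.
QUERY_PG_REPLICATION_SLOTS = "replication_slots"
QUERY_PG_REPLICATION_STATS_METRICS = "replication_stats"


def select_replication_queries(is_aurora: bool, wal_level: str) -> list:
    """Build the replication query list, skipping the stats query where it would fail."""
    queries = [QUERY_PG_REPLICATION_SLOTS]
    # Per the review comment, pg_current_wal_lsn() errors on Aurora unless
    # wal_level is 'logical', so only append the stats query when it is safe.
    if not (is_aurora and wal_level != "logical"):
        queries.append(QUERY_PG_REPLICATION_STATS_METRICS)
    return queries
```

With this shape, non-Aurora instances and Aurora instances running with logical WAL keep both queries, while a default Aurora setup only runs the slots query.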

codecov bot commented Nov 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.31%. Comparing base (5f145d8) to head (c6437d1).
⚠️ Report is 3 commits behind head on master.

@bonnefoa bonnefoa force-pushed the bonnefoa/pg-walsender-lsn-delay branch from d26b527 to 8dea5ff Compare November 6, 2025 09:28
github-actions bot commented Nov 6, 2025

⚠️ Major version bump
The changelog type "changed" or "removed" was used in this Pull Request, so the next release will bump the major version. Please make sure this is a breaking change, or use the "fixed" or "added" type instead.

@bonnefoa bonnefoa force-pushed the bonnefoa/pg-walsender-lsn-delay branch from 8dea5ff to 1ece735 Compare November 6, 2025 09:30
Only run the checkpoint and check metrics on the primary. This removes
the uncertainty over whether the checkpoint record has been propagated
to the standby before the standby's checkpoint is triggered.

Add the timestamp to the application name to reduce the risk of leftover
queries and connections being present when metrics are collected.
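
One way the timestamped application name could look is sketched below. The helper name, the prefix, and the exact name format are assumptions made for illustration; the scheme actually used in this PR is not shown here.

```python
import time
from typing import Optional


def timestamped_application_name(prefix: str, now: Optional[float] = None) -> str:
    """Return an application name that is unique per second of wall-clock time.

    A fresh name on each run means sessions left over from an earlier run
    carry a different application_name and can be told apart.
    """
    timestamp = time.time() if now is None else now
    return f"{prefix}_{int(timestamp)}"
```

The resulting string would then be passed as the application_name connection parameter when the connection is opened.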
@bonnefoa bonnefoa force-pushed the bonnefoa/pg-walsender-lsn-delay branch from 45beba7 to c6437d1 Compare November 6, 2025 10:56