Conversation

maxenglander
Collaborator

Previously approved but merged into the wrong base: #2

Background

The current behavior of the exporter is to open a new database connection on every scrape.

First, here is where the default metrics collector and new-style collectors are registered:

prometheus.MustRegister(exporter)

prometheus.MustRegister(pe)

Prometheus may run these collectors in parallel. Additionally, the collectors package is set up to support concurrent scrapes:

// copy the instance so that concurrent scrapes have independent instances
inst := p.instance.copy()
// Set up the database connection for the collector.
err := inst.setup()
defer inst.Close()

No doubt it's useful for some to have the default and new-style metrics collected concurrently, and to be able to support concurrent scrapes.

Changes

For PlanetScale, however, concurrent scrapes aren't useful: our metric collection system does not scrape the same endpoint concurrently. We can also live without whatever time is gained by collecting default and new-style metrics concurrently: our scrape timeout is 10s, and we expect metrics to be collected much faster than that. If not, we probably have other problems to look at.

We also want to consume as few customer connections as possible, so having the option of a single connection shared between default and new-style metrics is good for us. And if customers have used up all their connections, having each scrape open a new database connection might mean we can't produce metrics at all. So the option to use a single, shared, persistent connection is doubly useful.

Validation

Create a role with a connection limit of 1:

CREATE ROLE postgres_exporter WITH LOGIN CONNECTION LIMIT 1;
GRANT pg_monitor TO postgres_exporter;
GRANT CONNECT ON DATABASE postgres TO postgres_exporter;

With --no-concurrent-scrape

Start the exporter with concurrent scraping disabled:

DATA_SOURCE_NAME="postgresql://[email protected]:5432/postgres?sslmode=disable" ./postgres_exporter --no-concurrent-scrape --log.level=info

Scraping metrics shows no errors in output.

With --concurrent-scrape

Repeating the scrape with --concurrent-scrape shows this error:

time=2025-09-06T20:56:44.568-04:00 level=ERROR source=collector.go:195 msg="Error opening connection to database" err="error querying postgresql version: pq: too many connections for role "postgres_exporter""

Connection resilience

I verified that, with --no-concurrent-scrape, the exporter is resilient to Postgres connection resets. After a connection reset, the exporter logs an error:

time=2025-09-06T22:44:06.172-04:00 level=ERROR source=collector.go:180 msg="Error creating instance" err="driver: bad connection"

But the connection will be recreated and the scrape (or the next scrape anyway) will succeed. Behavior seems similar to master.

maxenglander marked this pull request as ready for review September 10, 2025 03:56
maxenglander merged commit 8ca1365 into main Sep 10, 2025
2 checks passed
maxenglander deleted the maxeng-disable-concurrent-scrapes branch September 10, 2025 03:56