Description
This summarizes the known state of the problem with postgres listen/notify and presents some approaches for solving it. Please share your thoughts on any missing parts and your opinion on which route to take.
Related issues/jiras:
- https://issues.redhat.com/browse/PULP-674
- pulp-worker creates some database pressure when scaling beyond 250 workers #6616
Summary
Problems with listen/notify:
- Degrades connection pooling effectiveness. ref-00: Listening on a postgres channel causes connection pinning on RDS Proxy; notifying doesn't.
- Notify might be interfering with performance. ref-01: Notify acquires a global table lock, which might degrade performance with multiple writes. We don't know whether Pulp is being hit by this. Also, since we can't target notifications to specific workers (everybody receives the data), the pubsub might be generating too much db activity (not proven to be a problem).
Current channels: ref-02
- pulp_worker_wakeup: Workers are notified to wake up when resources become available. Notify can be called by any component that is able to dispatch tasks (every Pulp component) or to request task cancellation (only the API?). In the worker, notify is called on task completion and in the cancellation logic.
- pulp_worker_metrics_heartbeat: Only used when otel is enabled. Notify can be called by any worker at every heartbeat. Workers race for the metrics notify lock.
- pulp_worker_cancel: Notify is called by the API.

Currently, only worker components listen for notifications.
Additional context on RDS Proxy
RDS Proxy sits between the client (in our case, the tasking workers) and the database. The connections between RDS Proxy and the database are called database connections; the connections between the client and RDS Proxy are called client connections. The whole idea is to multiplex client connections over database connections. Under some conditions (e.g., listening on a postgres channel), a client connection gets pinned to a database connection, preventing that database connection from being reused by other client connections.
Learn more in ref-00.
General Approach
Define a general API for pubsub which can use different backends.
That allows installations that don't need connection pooling to keep using postgres listen/notify.
An example is provided below:
# Start the pubsub client on a thread.
# Defaults to the pg listen/notify implementation
pubsub = PGPubSub()
pubsub = RedisPubSub()
pubsub = EtcdPubSub()
pubsub.subscribe("channel", callback)
pubsub.unsubscribe("channel")
pubsub.publish("channel", "optional-msg")
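A minimal in-process sketch of such an interface, assuming nothing about the final Pulp API (`PubSubBase` and `LocalPubSub` are illustrative names; the in-memory backend is mainly useful for tests and as a reference for what the contract means):

```python
import threading
from collections import defaultdict


class PubSubBase:
    """Abstract pubsub interface; concrete backends override these methods."""

    def subscribe(self, channel, callback):
        raise NotImplementedError

    def unsubscribe(self, channel):
        raise NotImplementedError

    def publish(self, channel, message=None):
        raise NotImplementedError


class LocalPubSub(PubSubBase):
    """In-memory backend: callbacks run in the publisher's process."""

    def __init__(self):
        self._lock = threading.Lock()
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        with self._lock:
            self._subscribers[channel].append(callback)

    def unsubscribe(self, channel):
        with self._lock:
            self._subscribers.pop(channel, None)

    def publish(self, channel, message=None):
        with self._lock:
            callbacks = list(self._subscribers.get(channel, ()))
        for callback in callbacks:
            callback(message)
```

A worker would then call something like `pubsub.subscribe("pulp_worker_wakeup", wakeup_handler)` regardless of which backend is configured.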
Specific Implementations considered
There are some centralized solutions, some distributed ones, and some mixed.
1. redis pubsub
Workers listen/notify to redis channels.
- Pros:
- known technology
- simple
- Cons:
- single-point of failure
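As a sketch of what this backend could look like on top of a redis-py style client (the `RedisPubSub` adapter and its channel-to-callback mapping are assumptions, not existing code; the message dicts follow redis-py's pubsub shape):

```python
class RedisPubSub:
    """Hypothetical adapter over a redis-py style client.

    `client` is expected to provide publish() and pubsub() with redis-py
    signatures; messages arrive as dicts with 'type', 'channel' and 'data'.
    """

    def __init__(self, client):
        self._client = client
        self._pubsub = client.pubsub()
        self._callbacks = {}
        self._running = False

    def subscribe(self, channel, callback):
        self._callbacks[channel] = callback
        self._pubsub.subscribe(channel)

    def unsubscribe(self, channel):
        self._callbacks.pop(channel, None)
        self._pubsub.unsubscribe(channel)

    def publish(self, channel, message=""):
        self._client.publish(channel, message)

    def _dispatch(self, msg):
        # Ignore subscribe/unsubscribe confirmations; route real messages.
        if not msg or msg.get("type") != "message":
            return
        channel = msg["channel"]
        if isinstance(channel, bytes):
            channel = channel.decode()
        callback = self._callbacks.get(channel)
        if callback:
            callback(msg["data"])

    def run(self):
        # Meant to run on a dedicated thread inside the worker.
        self._running = True
        while self._running:
            self._dispatch(self._pubsub.get_message(timeout=1.0))

    def stop(self):
        self._running = False
```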
2. etcd cluster
Workers watch/update on specific keys (used as channels). The setup requires an etcd service running on each node to handle the kv-store replication. It stores a Write Ahead Log on disk and has an API that the worker talks to (directly or via a client).
It's overkill for listen/notify alone, but its capabilities could make it easier to offload and improve other coordination tasks, such as various locks (unblocking, recording metrics, scheduling), and possibly reduce workers racing for writes and task lookups on the db.
- Pros:
- empowers the tasking system with distributed coordination primitives (e.g., builtin locks, leases and (re-)election)
- high availability / fault tolerance (logs replicated to each node through Raft)
- relatively lightweight and robust
- Cons:
- unknown technology
- adds deployment complexity: all components would require access to an etcd instance on their node (to be able to notify, for example)
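A rough sketch of the etcd backend, assuming a python-etcd3 style client where each channel maps to a watched key (`EtcdPubSub` and the key prefix are hypothetical, not existing code):

```python
class EtcdPubSub:
    """Hypothetical adapter over a python-etcd3 style client.

    Each channel maps to a key under PREFIX; every put on that key counts
    as one notification. Unlike NOTIFY, the last value also persists.
    """

    PREFIX = "/pulp/pubsub/"  # hypothetical key namespace

    def __init__(self, client):
        self._client = client
        self._watch_ids = {}

    def _key(self, channel):
        return self.PREFIX + channel

    def subscribe(self, channel, callback):
        def on_event(response):
            # One watch response may carry several events.
            for event in response.events:
                callback(event.value)

        self._watch_ids[channel] = self._client.add_watch_callback(
            self._key(channel), on_event
        )

    def unsubscribe(self, channel):
        watch_id = self._watch_ids.pop(channel, None)
        if watch_id is not None:
            self._client.cancel_watch(watch_id)

    def publish(self, channel, message=""):
        self._client.put(self._key(channel), message)
```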
There are other similar distributed services that use consensus algorithms, e.g.:
- NATS/jetstream https://docs.nats.io/running-a-nats-service/configuration/clustering/jetstream_clustering
- ZooKeeper: https://zookeeper.apache.org/
- Unlike etcd, it uses a centralized service for managing nodes; feels too heavy/complex
- Is used by Kafka
3. postgres-websockets
Workers talk to the websocket middleware provided by the postgres-websockets service. It uses postgres listen/notify under the hood, but workers talk to the middleware instead of the database, so it doesn't degrade connection pooling.
- Pros:
- We continue to use notify as we do today
- Cons:
- single point of failure
4. custom web-socket based solution:
Have a leader connected to all workers through websockets. The leader is the only component listening for postgres notifications. That causes its connection to be pinned by the connection pool, but it is always a single connection, no matter how many workers there are.
As a variant without any postgres listen/notify, components that need to notify
could have access to the workers' network and talk directly to the current leader.
- Pros:
- No new components required
- high availability (assuming a robust re-election implementation)
- Cons:
- Complexity of handling leader election and consensus
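The fan-out part of such a leader is simple on its own; the difficulty lies in the election and reconnection handling, which this sketch deliberately omits (all names are hypothetical, and worker connections are modeled as plain send callables rather than real websockets):

```python
class NotificationLeader:
    """Single listener on postgres; fans notifications out to workers.

    In practice each `send` would write to a websocket connection, and
    this class would also participate in leader (re-)election.
    """

    def __init__(self):
        self._workers = {}

    def register(self, worker_id, send):
        self._workers[worker_id] = send

    def deregister(self, worker_id):
        self._workers.pop(worker_id, None)

    def on_pg_notification(self, channel, payload=None):
        # Called from the single pinned LISTEN connection.
        dead = []
        for worker_id, send in self._workers.items():
            try:
                send(channel, payload)
            except Exception:
                dead.append(worker_id)  # drop broken connections
        for worker_id in dead:
            self.deregister(worker_id)
```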
Options considered
- Worker polling for updates: this brings us back to the problem of a herd of workers stressing the database.
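For completeness, the discarded polling approach would look roughly like the loop below; even with jitter to de-synchronize the herd, N workers still issue N queries per interval, which is exactly the load that pubsub avoids (`check_db` and `handle` are hypothetical callables):

```python
import random
import time


def poll_for_updates(check_db, handle, base_interval=1.0, jitter=0.5,
                     iterations=None):
    """Naive polling loop: every worker queries the database on a timer.

    `iterations` bounds the loop for testing; a real worker would run
    until shutdown.
    """
    count = 0
    while iterations is None or count < iterations:
        update = check_db()  # one db query per worker per interval
        if update is not None:
            handle(update)
        # Random jitter spreads queries out but doesn't reduce their count.
        time.sleep(base_interval + random.uniform(0, jitter))
        count += 1
```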