rabbitmq_*_federation: Stop links during plugin stop #14054
Merged
Why

Links are started by the plugins but placed under the `rabbit` supervision tree. The federation plugins' own supervision trees are, unfortunately, empty.

Links are stopped by a boot step executed by `rabbit`, as a consequence of unregistering the plugins' parameters.

Unfortunately, a link can also be terminated when its channel, and implicitly its connection, stops. This happens when the `amqp_client` application stops.

We end up with a race here:

1. Because the federation plugins' supervision trees are empty and the application stop functions merely stop the pg group (which doesn't terminate the group members), nothing waits for the links to stop. Therefore, `rabbit` can stop `amqp_client`, which is a dependency of the federation plugins, and the links' underlying channels and connections are stopped.
2. `rabbit` unregisters the federation parameters, terminating the links. The exchange link's `terminate/2` function needs the channel to delete the remote queue, but the channel and the underlying connection might already be gone.

When that happens, the failure is simply logged as a `badmatch` exception.

How
The solution is to make sure links are stopped as part of stopping the plugins.

`rabbit_federation_pg:stop_scope/1` is expanded to stop all members of all groups in this scope before terminating the pg scope itself. The new code waits for the stopped processes to exit.

We have to handle the `EXIT` signal in the link processes and change their restart strategy in their parent supervisor from permanent to transient. This ensures they are restarted only if they crash. It also avoids an error log message for each stopped link.
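
For readers unfamiliar with the pattern, the "stop every member, then wait for it to exit" step could look roughly like the sketch below. This is not the code from this PR: the module name `stop_scope_sketch`, the function `stop_all_members/2`, the `Timeout` argument and the choice of `shutdown` as the exit reason are all assumptions made for illustration. It only relies on the OTP `pg` functions `which_groups/1` and `get_local_members/2` and on process monitors, and it assumes the scope has already been started.

```erlang
-module(stop_scope_sketch).
-export([stop_all_members/2]).

%% Hypothetical helper, NOT the actual rabbit_federation_pg code.
%% Stops every local member of every group in Scope, then waits
%% (up to Timeout milliseconds per process) for each one to exit.
stop_all_members(Scope, Timeout) ->
    Pids = lists:usort(
             [Pid || Group <- pg:which_groups(Scope),
                     Pid <- pg:get_local_members(Scope, Group)]),
    %% Monitor before sending the exit signal so no 'DOWN' is missed.
    Refs = [{erlang:monitor(process, Pid), Pid} || Pid <- Pids],
    %% A process that traps exits receives {'EXIT', From, shutdown}
    %% as a message and has to handle it to stop cleanly.
    lists:foreach(fun(Pid) -> exit(Pid, shutdown) end, Pids),
    wait_for_exit(Refs, Timeout).

wait_for_exit([], _Timeout) ->
    ok;
wait_for_exit([{Ref, Pid} | Rest], Timeout) ->
    receive
        {'DOWN', Ref, process, Pid, _Reason} ->
            wait_for_exit(Rest, Timeout)
    after Timeout ->
        %% Give up: drop the remaining monitors and report the
        %% process that did not exit in time.
        lists:foreach(fun({R, _}) -> erlang:demonitor(R, [flush]) end,
                      [{Ref, Pid} | Rest]),
        {error, {timeout, Pid}}
    end.
```

A caller would invoke something like `stop_scope_sketch:stop_all_members(Scope, 5000)` before terminating the scope process itself. On the supervisor side, a child specification with `restart => transient` is restarted only when the child terminates abnormally, that is, with an exit reason other than `normal`, `shutdown` or `{shutdown, Term}`, which is why a link stopped this way is not started again.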