Skip to content

Conversation

@AntonRoskvist
Copy link
Contributor

This turned out to be quite a bit more involved than what I originally anticipated, please let me know if anything is missing or is otherwise looking strange

@gtully
Copy link
Contributor

gtully commented Jun 5, 2024

Have you seen:

with the zk locker we can have peer brokers that coordinate on a shared node id. it is called the coordinator-id in the zk activation config, but ends up as the node-id. With the potential for unknown backups with replication, the node id has to be reset. However this new config could make it simpler for pure peers, where there is no replication b/c the it is non persistent or uses a shared store.

I guess we need to allow these to coexist or document that they must be used independently.

@gtully
Copy link
Contributor

gtully commented Jun 5, 2024

Another use case is jdbc, it currently uses a new uuid as the node id, it too could benefit from a shared identity, allowing a restart/reconnect of an existing lock owner to retain a valid lease. At the moment, any restart results in a a full lease expiry because the node id is not consistent.
If the nodeid is configured through this logic then we can avoid the id change and speed up recovery/restart

https://github.com/apache/activemq-artemis/blob/main/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/impl/jdbc/JdbcNodeManager.java#L70

@AntonRoskvist
Copy link
Contributor Author

Thanks for your feedback @gtully

If I am not mistaken the first case should be addressed already, where the coordination-id takes precedence over the configured nodeID. It should be verified by: org.apache.activemq.artemis.tests.integration.replication.LockManagerReplicationTest#testPrimaryPeers() let me know if I misunderstood though.

I will take a look at the JDBC case, thanks!

@AntonRoskvist
Copy link
Contributor Author

@gtully Do you know of any current test that demonstrates (or at least includes) having to wait for a lease expiry for the JDBC case?

@gtully
Copy link
Contributor

gtully commented Jun 10, 2024

@gtully Do you know of any current test that demonstrates (or at least includes) having to wait for a lease expiry for the JDBC case?

public void shouldAcquireExpiredLock() throws InterruptedException {

that looks like it should have to wait.

@jsmucr
Copy link
Contributor

jsmucr commented Jul 9, 2025

Any potential updates on this?
Today I've come across this when replacing a master node with an entirely new one, and despite the configuration being correct, I was left with two live nodes instead of a live node and a replica.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants