Commit 2de7b87

mctp: Add retry for one-time peer property queries on timeout

The function `query_peer_properties()` is called once during peer initialization to query basic information after the EID becomes routable. To improve reliability, this change adds a retry mechanism when the query fails with `-ETIMEDOUT`.

Since these queries are one-time initialization steps, a single successful attempt is sufficient, and retrying enhances stability under transient MCTP bus contention or multi-master timing issues.

Testing:
```
root@bmc:~# journalctl -xeu mctpd.service | grep Retrying
Oct 29 00:35:21 bmc mctpd[31801]: mctpd: Retrying to get endpoint types for peer eid 10 net 1 phys physaddr if 4 hw len 1 0x20 state 1. Attempt 1
Oct 29 00:39:00 bmc mctpd[32065]: mctpd: Retrying to get endpoint types for peer eid 10 net 1 phys physaddr if 4 hw len 1 0x20 state 1. Attempt 1
Oct 29 00:39:01 bmc mctpd[32065]: mctpd: Retrying to get endpoint types for peer eid 10 net 1 phys physaddr if 4 hw len 1 0x20 state 1. Attempt 2
Oct 29 00:45:08 bmc mctpd[32360]: mctpd: Retrying to get endpoint types for peer eid 10 net 1 phys physaddr if 4 hw len 1 0x20 state 1. Attempt 1
```

Signed-off-by: Daniel Hsu <[email protected]>
1 parent 8b019a3 commit 2de7b87

1 file changed: +47 -13 lines changed

src/mctpd.c

Lines changed: 47 additions & 13 deletions
@@ -2835,23 +2835,57 @@ static int method_learn_endpoint(sd_bus_message *call, void *data,
 // and routable.
 static int query_peer_properties(struct peer *peer)
 {
+	const unsigned int max_retries = 4;
 	int rc;
 
-	rc = query_get_peer_msgtypes(peer);
-	if (rc < 0) {
-		// Warn here, it's a mandatory command code.
-		// It might be too noisy if some devices don't implement it.
-		warnx("Error getting endpoint types for %s. Ignoring error %d %s",
-		      peer_tostr(peer), rc, strerror(-rc));
-		rc = 0;
+	for (unsigned int i = 0; i < max_retries; i++) {
+		rc = query_get_peer_msgtypes(peer);
+
+		// Success
+		if (rc == 0)
+			break;
+
+		// On timeout, retry
+		if (rc == -ETIMEDOUT) {
+			if (peer->ctx->verbose)
+				warnx("Retrying to get endpoint types for %s. Attempt %u",
+				      peer_tostr(peer), i + 1);
+			continue;
+		}
+
+		// On other errors, warn and ignore
+		if (rc < 0) {
+			if (peer->ctx->verbose)
+				warnx("Error getting endpoint types for %s. Ignoring error %d %s",
+				      peer_tostr(peer), -rc, strerror(-rc));
+			rc = 0;
+			break;
+		}
 	}
 
-	rc = query_get_peer_uuid(peer);
-	if (rc < 0) {
-		if (peer->ctx->verbose)
-			warnx("Error getting UUID for %s. Ignoring error %d %s",
-			      peer_tostr(peer), rc, strerror(-rc));
-		rc = 0;
+	for (unsigned int i = 0; i < max_retries; i++) {
+		rc = query_get_peer_uuid(peer);
+
+		// Success
+		if (rc == 0)
+			break;
+
+		// On timeout, retry
+		if (rc == -ETIMEDOUT) {
+			if (peer->ctx->verbose)
+				warnx("Retrying to get peer UUID for %s. Attempt %u",
+				      peer_tostr(peer), i + 1);
+			continue;
+		}
+
+		// On other errors, warn and ignore
+		if (rc < 0) {
+			if (peer->ctx->verbose)
+				warnx("Error getting UUID for %s. Ignoring error %d %s",
+				      peer_tostr(peer), -rc, strerror(-rc));
+			rc = 0;
+			break;
+		}
 	}
 
 	// TODO: emit property changed? Though currently they are all const.
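
The two loops in this hunk share the same shape: call the query, stop on any result other than `-ETIMEDOUT`, otherwise try again up to a fixed bound. A minimal standalone sketch of that pattern follows. It is not part of the commit: `retry_on_timeout()`, `struct fake_peer`, and `fake_query()` are hypothetical names introduced only for illustration, and the stub stands in for `query_get_peer_msgtypes()` / `query_get_peer_uuid()`, whose real signatures take mctpd's `struct peer`.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for mctpd's struct peer; only what the sketch needs. */
struct fake_peer {
	unsigned int calls;    /* how many times the query has run */
	unsigned int timeouts; /* number of leading attempts that time out */
	int final_rc;          /* result returned once the timeouts are used up */
};

typedef int (*query_fn)(struct fake_peer *peer);

/* Stub query: times out `timeouts` times, then returns final_rc. */
static int fake_query(struct fake_peer *peer)
{
	peer->calls++;
	if (peer->calls <= peer->timeouts)
		return -ETIMEDOUT;
	return peer->final_rc;
}

/*
 * Hypothetical helper (an assumption, not mctpd code): run `query` up to
 * max_attempts times, repeating only when it returns -ETIMEDOUT.
 * Returns the rc of the last attempt, or -ETIMEDOUT if max_attempts is 0.
 */
static int retry_on_timeout(query_fn query, struct fake_peer *peer,
			    unsigned int max_attempts, const char *what)
{
	int rc = -ETIMEDOUT;

	for (unsigned int i = 0; i < max_attempts; i++) {
		rc = query(peer);
		if (rc != -ETIMEDOUT)
			break; /* success or a non-timeout error: stop */
		fprintf(stderr, "Retrying to get %s. Attempt %u\n", what, i + 1);
	}
	return rc;
}
```

The helper returns the last result instead of zeroing it, so the caller still decides which errors to ignore. That includes the case where every attempt times out: in the loops above, `rc` is still `-ETIMEDOUT` when the loop ends, and how that value is treated depends on code outside this hunk.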
