MCTP Bridge Polling #85
Conversation
Why not just store the allocated pool range on the bridge peer? This would be a considerably simpler change, and means the separate …
For now we are storing the start of the pool and the pool size (in some sense this represents the allocated pool EID range). Would this requirement still be needed?
Not sure I understand the question - if the … In your eccc13f change, the …
I see - I think I understand what your ask is here: peer->pool_start and peer->pool_size should represent the "allocated" information (coming as the response to the ALLOCATE_ENDPOINT_IDS command, whatever the bridge's final say is).

We are initially storing the "proposed" pool start, and the minimum of the "proposed" pool size and the actual pool size (from the Set Endpoint ID response), in these two vars, and the allocation request is sent based on that final proposed data. But once we get a response, whatever has been allocated by the bridge is kept as pool_start and pool_size inside the peer. That range is then marked as reserved in the new reserved_eid_set per network.

But I see your point now: our bridge could report any pool size as the actual size in the Set Endpoint ID response, the proposed size could differ, and so could the finally allocated size (from the ALLOCATE_ENDPOINT_IDS response). Maybe we can introduce a new var representing the actual pool size (from the Set Endpoint ID response), and treat the two vars peer->pool_start and peer->pool_size as allocated data, while the network reserved_eid_set continues to interact with them as it has been doing.
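To make the distinction concrete, here is a minimal sketch of per-peer state that separates the three quantities (proposed, bridge-reported, and finally allocated). The field names are hypothetical, not mctpd's actual struct peer layout:

```c
#include <stdint.h>

typedef uint8_t mctp_eid_t;

/* Hypothetical per-peer bridge pool state; a sketch of the
 * separation discussed above, not mctpd's actual layout. */
struct bridge_pool_state {
	/* proposed by us, sent in the Allocate Endpoint IDs request */
	mctp_eid_t req_pool_start;
	uint8_t req_pool_size;

	/* pool size the bridge reported in its Set Endpoint ID response */
	uint8_t bridge_pool_size;

	/* final range from the Allocate Endpoint IDs response; this is
	 * what the per-network reserved_eid_set should track */
	mctp_eid_t alloc_pool_start;
	uint8_t alloc_pool_size;
};
```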
I have also added a TODO comment here: in our current implementation of the bus-owner pool and local EID assignment, we don't have a proper check to see whether a local EID being set via the mctp tool (mctp addr add <eid> dev <interface>) for any network has already been used by the bus owner when it assigned a dynamic/static EID to an MCTP device/endpoint.

Reference: I have two interfaces, mctpusb0 and mctpspi0_2. I try to add another local EID to mctpspi0_2 (EID 20). Here EID 20 was already in use as a bridge downstream EID under mctpusb0, hence the error message. I still see mctp link map retaining EID 20 even though there was an error setting the local EID (the error is not seen in the mctp tool, but in mctpd). Not sure if this was intentional?
Since we are getting into the reservation of EIDs, our new local EID could very well be from the reserved EID space. So should we introduce a restriction of this kind in the mctp tool/mctpd?
Yes, that's along the lines of what I was thinking, but maybe with the names adjusted: … And then as an improvement: since the …
There's not much we can do about that case; mctpd isn't (currently) managing that local EID, so there is not much we can do on conflicts - as you have noted.
No, not while mctpd is not responsible for assigning local EIDs. I see the best way to prevent this case is a facility to explicitly configure mctpd with the range of dynamically-allocatable EIDs, and it's up to the platform integrator to ensure that that range does not conflict with the potential set of local EIDs. (that's one of the reasons for the configuration file support, for exactly this use-case)
Table 23 – Allocate Endpoint IDs message: "An error completion code (ERROR_INVALID_DATA) shall be returned if the number of EIDs being allocated (Number of Endpoint IDs) exceeds the Dynamic Endpoint ID Pool size. (This error condition does not apply when the number of endpoint IDs passed in the request is 0x00.)"
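For illustration, the bridge-side check that this spec text describes could look roughly like the following. This is a sketch: the completion-code values are DSP0236's, but the macro and function names are this example's own:

```c
#include <stdint.h>

#define MCTP_CTRL_CC_SUCCESS			0x00
#define MCTP_CTRL_CC_ERROR_INVALID_DATA		0x02

/* Validate the "Number of Endpoint IDs" field of an Allocate
 * Endpoint IDs request against the dynamic pool size, per the
 * Table 23 requirement quoted above. */
static uint8_t check_alloc_count(uint8_t num_eids, uint8_t pool_size)
{
	/* a request for 0x00 EIDs is exempt from the range check */
	if (num_eids != 0x00 && num_eids > pool_size)
		return MCTP_CTRL_CC_ERROR_INVALID_DATA;

	return MCTP_CTRL_CC_SUCCESS;
}
```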
Acknowledged.
If there is no way to better handle the transient requested peer size, I would prefer to just store it on the peer.
We have the capacity of the bridge, from the Set Endpoint ID response, no? As you say: …
I would suggest: …

If the allocation is not possible in (2), just fail for now. We can look at fallbacks in a follow-up. I think this is fine to do without a new dbus call, but am open to ideas about why we may need one.
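A sketch of one possible flow along these lines (all names are illustrative, not mctpd's actual API): take the bridge's capacity from its Set Endpoint ID response, try to reserve a contiguous EID range in the network, and fail outright if none is available:

```c
#include <stdint.h>

typedef uint8_t mctp_eid_t;

struct net; /* per-network state, holding the reserved-EID set */

struct peer {
	uint8_t bridge_pool_size;	/* from Set Endpoint ID response */
	mctp_eid_t req_pool_start;
	uint8_t req_pool_size;
};

/* hypothetical helper: find and reserve a contiguous free EID range */
int net_reserve_eid_range(struct net *net, uint8_t size, mctp_eid_t *start);

static int bridge_setup_pool(struct net *net, struct peer *bridge)
{
	mctp_eid_t start;
	uint8_t size = bridge->bridge_pool_size;
	int rc;

	if (!size)
		return 0; /* bridge has no dynamic pool to serve */

	rc = net_reserve_eid_range(net, size, &start);
	if (rc < 0)
		return rc; /* no fallback for now, as discussed */

	bridge->req_pool_start = start;
	bridge->req_pool_size = size;

	/* ... then issue Allocate Endpoint IDs with (start, size) */
	return 0;
}
```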
Please let me know if I understood you correctly on the steps: introduce req_pool_size and req_pool_start, and update pool_size -> alloc_pool_size and pool_start -> alloc_pool_start.

If we are going by this, then do we still have to get the pool_start argument from the dbus method AssignBridgeStatic if we have already decided what the pool start will be from req_pool_start? Or do you mean not to use any new dbus method AssignBridgeStatic at all?
Since the allocation mechanism is now in #71, I'll shift this discussion over there.
Force-pushed from 2746a86 to a472892
Hello Jeremy,
I'm going to hold off on reviewing this one until we have the core bridge support done, given that's a hard dependency for this.
Understood, sounds good.
Hello Jeremy,

```mermaid
graph LR
    A["HMC/BMC"] --> B["MCTP Bridge"]
    B --> C["GPU1"]
    B --> D["GPU2"]
```
I'm aware that in the (discussed) polling approach, mctpd doesn't care about the state of devices behind the bridge once they have been discovered. If this seems to you like a legitimate reason to consider keeping continuous polling, I have a further question on the implementation.

The particular use case which might not get solved even with polling is a fast reset of downstream endpoints, such as the GPUs themselves, which happens in a time window of a couple of hundred milliseconds (100-200 msec); the polling window would not be narrow enough to capture such a fast reset, and from the MCTP perspective the endpoint never dropped off the MCTP network. In such a case NSM would not even get notified about this fast reset of the GPU. I need some help figuring out how we expect the endpoint to notify the bus owner, or the bus owner to become aware, of such a fast-reset case.

I'm open to any further guidance/opinions on the polling mechanism itself, on whether MCTP should even care about this, etc. Or maybe help me understand better the philosophy of mctpd not caring about the state of devices downstream of a bridge, or of a direct peer itself.
The main design point here is that mctpd manages the MCTP transport itself, so is only concerned with the state of connectivity to managed MCTP devices. It does not manage the general state of those devices, like your scenario above.

For example, in the fast-reset case: nothing has changed in mctpd's view of the network. The device has been reset, but it sounds like it has restored some MCTP state, in that it is still using its assigned EID - so no connectivity has been changed. (or does a reset clear the EID? in which case the reset would be visible to mctpd?)

From what you have mentioned though, the issue there seems to be that the device is retaining some state (the EID) but not other state (the event subscriptions).

We do not perform ongoing polling of the MCTP devices, as we cede that responsibility to upper-level applications once connectivity has been established. If those applications detect a loss of connectivity, they call the endpoint's Recover method.

This means that we do not detect endpoints that may disappear but have no upper-level applications interacting with them. This is a deliberate decision - if nothing is communicating with them, do we care that we have connectivity?

For the reset-detection in general though, some upper-layer protocols have mechanisms to detect that, which would typically re-trigger some form of setup.
Thank you for the clarification, your words do make sense to me.
For our platform devices, yes it does, but it's given back the same EID by the bridge, meaning once the device is out of reset it responds to the next MCTP command (GET_ENDPOINT_ID). Reset would be visible to …
Well, this is the problem I'm currently facing: Nvidia NSM doesn't have a direct mechanism for checking a reset, especially one that happens this quickly (100 msec). Even if it did something like polling to keep track of the device state, that would not be frequent enough to catch such a quick reset.

Had this been a direct endpoint (non-bridge) to the BMC, would the MCTP Discovery Notify command be the right way for the endpoint to signal its reset, since in that case the EID would be cleared for the endpoint and it would be a freshly added endpoint to the MCTP network? If so, is there any plan to include this support in mctpd? From your previous comments and other issue discussions I suppose your take is different on Discovery Notify and when it is to be used.
Ah, so there's no direct connection, I see.
No, the discovery notify command is only between the endpoint and its direct bus owner. From DSP0236 § 12.15 "Discovery Notify": …
To me, the spec reads as if the intention for Discovery Notify is to provide a bus-hotplug mechanism when the physical bus does not already provide it. Even without the direct connection, it sounds like you're attempting to use the MCTP control protocol to communicate upper-layer protocol state. I would think that that device state (ie, the event subscriptions and/or reset condition) really belongs as part of that upper-layer protocol. For example, NVMe-MI has facilities to indicate that a subsystem has been reset (see the "NVM Subsystem Reset Occurred" event); perhaps you could do something similar here? [I am not familiar with NSM though]
Thank you for the clarification. (I agree with you; maybe there is some other way to handle this use-case. I'll go over the NVMe-MI implementation and the NVM Subsystem Reset Occurred event.) I'll update the PR with a rebase and some more modifications. Please hold off on the review till then.
Force-pushed from fdaa069 to 79e5e46
Hello Jeremy, I've updated the PR per the last discussion on stopping polling. It's up for review now.

Sorry to keep coming back to this, but is there any plan to support Discovery Notify for a direct peer to the BMC as bus owner, specifically for bindings where we don't explicitly have methods like I2C bus detection to identify endpoint presence?
Force-pushed from cf98296 to c892110
Thanks for the updates. You now seem to have a bunch of fixes/changes to the bridging implementation, which don't belong in the polling feature addition. Can you separate those into a different PR please?
Created a new PR for tracking any other fixes needed for bridge support, but keeping these commits in this PR too to address the test cases; will rebase once the other PR gets approved.
Referring to DSP0236 sec 8.17.6, Reclaiming EIDs from hot-plug devices. Requirement: the bus owner/bridge needs to periodically poll with the GET_ENDPOINT_ID control command to check if a downstream endpoint is accessible. Once it's established that the endpoint is accessible, polling needs to stop. Introduce a new endpoint polling configuration to be used as the periodic interval for sending the poll command GET_ENDPOINT_ID by the bus owner/bridge.

Signed-off-by: Faizan Ali <[email protected]>
Since downstream endpoints are behind the bridge, the available query_get_endpoint_id() won't be able to direct the GET_ENDPOINT_ID command packet behind the bridge, as the physical address of a downstream endpoint is the same as the bridge's own. But we can query the endpoint directly, since the routes have already been laid out when the bridge was allocated the EID pool space.

Signed-off-by: Faizan Ali <[email protected]>
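Because the kernel's AF_MCTP routing is by EID, such a direct query can simply address the downstream EID and let the gateway route deliver it via the bridge. A minimal sketch, with error handling trimmed and an illustrative function name:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/mctp.h>

#ifndef AF_MCTP
#define AF_MCTP 45 /* may be missing from older libc headers */
#endif

/* Send a Get Endpoint ID request straight to a bridged EID. The
 * kernel gateway route, installed when the bridge was allocated its
 * pool, carries it via the bridge's physical address. */
static int poll_get_endpoint_id(int sd, unsigned int net, mctp_eid_t eid)
{
	struct sockaddr_mctp addr;
	/* control header: Rq=1, instance id 0; command 0x02 = Get Endpoint ID */
	uint8_t req[2] = { 0x80, 0x02 };

	memset(&addr, 0, sizeof(addr));
	addr.smctp_family = AF_MCTP;
	addr.smctp_network = net;
	addr.smctp_addr.s_addr = eid;
	addr.smctp_type = 0x00; /* MCTP control protocol */
	addr.smctp_tag = MCTP_TAG_OWNER;

	return sendto(sd, req, sizeof(req), 0,
		      (struct sockaddr *)&addr, sizeof(addr));
}
```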
Implement an endpoint periodic polling mechanism to validate bridged endpoint accessibility. Begin polling as soon as the gateway routes are created. Stop polling once it's established that the endpoint path is accessible. Publish the peer path once the downstream endpoint responds to the poll command.

Signed-off-by: Faizan Ali <[email protected]>
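As a sketch of how such a poll timer could be driven (mctpd uses sd-event; the context struct and helper functions here are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>
#include <systemd/sd-event.h>

struct peer;

/* hypothetical per-endpoint polling context and helpers */
struct poll_ctx {
	struct peer *peer;
	uint64_t interval_usec;
	bool responded;
};

int send_get_endpoint_id(struct poll_ctx *ctx);
int publish_peer(struct peer *peer);

/* One-shot timer callback: re-arms itself until the endpoint has
 * answered Get Endpoint ID, then publishes the peer and stops. */
static int poll_timer_cb(sd_event_source *s, uint64_t usec, void *userdata)
{
	struct poll_ctx *ctx = userdata;

	if (ctx->responded) {
		publish_peer(ctx->peer);
		return sd_event_source_set_enabled(s, SD_EVENT_OFF);
	}

	send_get_endpoint_id(ctx);

	sd_event_source_set_time_relative(s, ctx->interval_usec);
	return sd_event_source_set_enabled(s, SD_EVENT_ONESHOT);
}
```

The source would be created with sd_event_add_time_relative() using the configured interval, and the response handler would just set ctx->responded.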
Remove all downstream endpoints as well if the bridge is being removed, and also stop endpoint periodic polling.

Signed-off-by: Faizan Ali <[email protected]>
Add a new bus-owner configuration option for periodic polling of bridged endpoints.

Signed-off-by: Faizan Ali <[email protected]>
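For illustration, such a setting might appear in mctpd.conf along these lines; the section and key names here are this sketch's assumption, and the actual names used by the PR may differ:

```toml
# mctpd.conf - bus-owner mode with bridged-endpoint polling
mode = "bus-owner"

[bus-owner]
# hypothetical key: interval between Get Endpoint ID polls of
# bridged endpoints, in milliseconds
poll_interval_ms = 1000
```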
Currently find_endpoint incorrectly tries to find a physical address for the given EID. This fails in case the EID is a gatewayed endpoint, because gatewayed endpoints don't have neighbours or a physical address. For gateway routes, return the gateway's physical address instead. The gateway endpoint will then forward the message to the correct bridged endpoint internally, by looking into its bridged endpoint list.

Signed-off-by: Faizan Ali <[email protected]>
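A sketch of the described lookup fix; the types and helper names are illustrative, not the actual mctpd code:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t mctp_eid_t;

struct net;
struct phys_addr;
struct route;

struct route *lookup_route(struct net *net, mctp_eid_t eid);
bool route_is_gateway(const struct route *rt);
mctp_eid_t route_gateway_eid(const struct route *rt);
int neighbour_phys_addr(struct net *net, mctp_eid_t eid,
			struct phys_addr *phys);

/* Resolve the physical address to use for a given EID: for gateway
 * routes, bridged EIDs have no neighbour entry of their own, so
 * return the gateway's (bridge's) physical address instead. */
static int resolve_phys(struct net *net, mctp_eid_t eid,
			struct phys_addr *phys)
{
	struct route *rt = lookup_route(net, eid);

	if (!rt)
		return -ENOENT;

	if (route_is_gateway(rt))
		return neighbour_phys_addr(net, route_gateway_eid(rt), phys);

	return neighbour_phys_addr(net, eid, phys);
}
```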
Test that bridge polling discovers downstream endpoints and creates endpoint D-Bus objects. Test that polling stops once a downstream endpoint is discovered, and continues until the endpoint responds to the GET_ENDPOINT_ID poll command. Test that once a bridge endpoint is removed, the downstream endpoints associated with the bridge get removed too.

Signed-off-by: Faizan Ali <[email protected]>
Force-pushed from c892110 to d48c392
Signed-off-by: Faizan Ali <[email protected]>
Force-pushed from d48c392 to a08c28f
Rebased the PR.
MCTP Bridge Endpoint poll
Reference: DSP0236 section 8.17.6, Reclaiming EIDs from hot-plug devices
The bus owner/bridge can detect a removed device or devices by validating the EIDs that are presently allocated to endpoints that are directly on the bus and identifying which EIDs are missing. It can do this by attempting to access each endpoint that the bridge has listed in its routing table as being a device that is directly on the particular bus. Attempting to access each endpoint can be accomplished by issuing the Get Endpoint ID command.
This PR aims to extend mctpd as bus owner to enhance the bridge support feature by incorporating downstream endpoint polling. One way to detect the accessibility of downstream devices is for the bus owner/bridge to probe each of the bridge's downstream endpoints using the Get Endpoint ID command.
Once it's established that the endpoint path is accessible, we stop polling that EID and publish the EID's endpoint object.
This PR is based on the discussions over MCTP Bridge Support: #71