Skip to content

Conversation

@klihub
Copy link
Collaborator

@klihub klihub commented Sep 2, 2025

This PR improves CPU resource allocation for burstable QoS class containers.

Currently CPU burstability is not taken into account when a pool is picked for a container. For containers with high burstability (large CPU limit - CPU request), this can result in an assignment which cannot have any or enough excess CPU capacity for bursting.

This PR changes that behavior. When picking a pool for a burstable QoS class container, take into account the CPU limit, too. For limited burstable containers, prefer pools with enough free capacity for the limit. For unlimited burstable containers, prefer pools at a topology level indicated by the new unlimitedBurstable configuration option, with a default of package indicating sockets. This can be overridden on a per pod or container level using the new unlimited-burstable.resource-policy.nri.io annotation. Unlimited containers will typically end up in a pool at the configured or annotated topology level, unless affinity, memory types, or topology hints dictate otherwise.

@klihub klihub force-pushed the devel/topology-aware/pick-pool-by-burstable-limit branch 2 times, most recently from 97af7d0 to 21943c5 Compare September 5, 2025 06:46
@klihub klihub force-pushed the devel/topology-aware/pick-pool-by-burstable-limit branch from 21943c5 to 967f30c Compare November 4, 2025 06:58
@klihub klihub marked this pull request as ready for review November 4, 2025 06:58
@klihub
Copy link
Collaborator Author

klihub commented Nov 4, 2025

@askervin @fmuyassarov PTAL

@klihub klihub force-pushed the devel/topology-aware/pick-pool-by-burstable-limit branch 2 times, most recently from 93af3c6 to 51d20ba Compare November 5, 2025 13:37
Add topology-aware configuration option for which topology level
we try to allocate containers with unlimited burstability at. To
allow sharing the existing topology level configuration str-enum
move it from policy/balloons to policy and reuse it in topology-
aware as well.

Signed-off-by: Krisztian Litkey <[email protected]>
@klihub klihub force-pushed the devel/topology-aware/pick-pool-by-burstable-limit branch 2 times, most recently from 06b6230 to e2f3b4c Compare November 5, 2025 17:37
When picking a pool for a burstable QoS class container, take
into account the CPU limit, too. For limited containers prefer
pools with enough free capacity for the limit. For unlimited
containers prefer pools with more free capacity left, which
means in practice that such containers will typically end up
in the root pool, unless affinity of topology hints dictate
otherwise.

Signed-off-by: Krisztian Litkey <[email protected]>
Update existing burstable placement test for altered behavior.
Add new test to verify CPU limit based burstable placement of
limited and unlimited burstable containers.

Signed-off-by: Krisztian Litkey <[email protected]>
@klihub klihub force-pushed the devel/topology-aware/pick-pool-by-burstable-limit branch from e2f3b4c to 192c194 Compare November 5, 2025 17:42
@klihub
Copy link
Collaborator Author

klihub commented Nov 5, 2025

@askervin @fmuyassarov I added configurability for what unlimited burstability means. There is a global configuration option and a per-pod and container annotation to override the global default. I think this would be now ready for review.

@klihub klihub requested a review from askervin November 6, 2025 12:27
Copy link
Collaborator

@askervin askervin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, very nice, thanks @klihub!

The last thing I was left pondering is whether or not the documentation should state what is the default for the default unlimitedBurstable option when it is not explicitly given. On one hand, I would think it would be valuable to document that by default burstable containers without CPU limit will be limited to the socket level, but on the other hand, none of the options in "Configuring the Policy" section tell what's the default when omitted. Making this an exception would not look too good, either.

That's why I'd think this PR is good as is, and documenting the defaults for this and other options, if necessary, would be a topic for separate PR(s).

@klihub
Copy link
Collaborator Author

klihub commented Nov 6, 2025

LGTM, very nice, thanks @klihub!

The last thing I was left pondering is whether or not the documentation should state what is the default for the default unlimitedBurstable option when it is not explicitly given. On one hand, I would think it would be valuable to document that by default burstable containers without CPU limit will be limited to the socket level, but on the other hand, none of the options in "Configuring the Policy" section tell what's the default when omitted. Making this an exception would not look too good, either.

That's why I'd think this PR is good as is, and documenting the defaults for this and other options, if necessary, would be a topic for separate PR(s).

I was also thinking about adding a mention of the (current) default to the documentation, because it makes sense to document it somewhere. The counter-argument for putting it into the documentation would be that if we ever to decide changing it then there is a danger of stale/incorrect documentation, if we forget to update them.

Now the default is both set and 'documented' in the Custom Resource Definition generated for configuration, in one place so there is less danger of it going out of sync. But this is so iffy, that I almost added it to the docs. And I still don't know which one would really make more sense. Maybe a separate chapter listing all the configuration defaults, which we could also then link from the Helm chart (docs) which is the primary way of consuming the plugins.

@askervin
Copy link
Collaborator

askervin commented Nov 7, 2025

I was also thinking about adding a mention of the (current) default to the documentation, because it makes sense to document it somewhere. The counter-argument for putting it into the documentation would be that if we ever to decide changing it then there is a danger of stale/incorrect documentation, if we forget to update them.

What would you think about a solution somewhere in the middle ground, that is, "make generate" or something would actually generate docs/resource-policy/policy/topology-aware-defaults.md that we could refer to in topology-aware.md? (And probably the same for balloons....)

@klihub
Copy link
Collaborator Author

klihub commented Nov 7, 2025

I was also thinking about adding a mention of the (current) default to the documentation, because it makes sense to document it somewhere. The counter-argument for putting it into the documentation would be that if we ever to decide changing it then there is a danger of stale/incorrect documentation, if we forget to update them.

What would you think about a solution somewhere in the middle ground, that is, "make generate" or something would actually generate docs/resource-policy/policy/topology-aware-defaults.md that we could refer to in topology-aware.md? (And probably the same for balloons....)

If we want to document all the configuration defaults in one place, I think we would need something like that. I'm just not sure if there is enough existing tooling to do it without actually installing the CRD on a cluster, creating a CR instance, then reading out how it came out to be...

@askervin askervin merged commit d745e29 into containers:main Nov 10, 2025
10 checks passed
@klihub klihub deleted the devel/topology-aware/pick-pool-by-burstable-limit branch November 10, 2025 14:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants