chu11
Member

@chu11 chu11 commented Jan 25, 2025

Per the discussion in #6125, a denial-of-service attack could be mounted against the KVS by submitting very large KVS transactions.

Support two configurations for capping the number of operations users can submit: one for each individual transaction made by a caller, and one for the combined total of operations from a fence.

For the time being, I made the default 64K for the transaction cap and 1M for the fence cap.

I made this WIP only b/c those defaults may be tweaked depending on what stats we get from the prior PR #6556. I would like to merge only after we gather a bit of data, although I'd be quite shocked if we have to adjust the defaults. Edit: Or alternately, if we'd like to just get the code in, we could default the max to LLONG_MAX and lower the default at a later time.

The only other thought: I decided to return the errno E2BIG when a transaction exceeds a cap. It's possible there is a better errno for this; I picked it b/c I thought "ehhh that's not bad".
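
To make the proposed behavior concrete, here is a minimal sketch of a cap check that fails with E2BIG. This is not the code in this PR: `txn_check_op_count`, `TXN_MAX_OPS_DEFAULT`, and the convention that a cap of 0 means "no limit" are all assumptions made up for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical default, taken from the 64K figure in the description. */
#define TXN_MAX_OPS_DEFAULT 65536

/* Return 0 if a transaction with op_count operations is within the cap,
 * or -1 with errno set to E2BIG if it exceeds the cap.
 * Assumption: max_ops == 0 means "no limit".
 */
int txn_check_op_count (size_t op_count, size_t max_ops)
{
    if (max_ops > 0 && op_count > max_ops) {
        errno = E2BIG;
        return -1;
    }
    return 0;
}
```

A caller would run this check before accepting the transaction, so an oversized request is rejected up front instead of being processed.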

@chu11
Member Author

chu11 commented Feb 3, 2025

re-pushed, removing all fence related parts of this PR, since KVS fence support is now gone

@chu11 chu11 force-pushed the issue6572_kvs_transaction_max branch from 07ed829 to 7b1939d Compare February 3, 2025 19:35
@chu11
Member Author

chu11 commented Feb 3, 2025

re-pushed, removing all the fence stuff that is no longer relevant b/c KVS fence support was removed

@chu11 chu11 changed the title WIP: kvs: support configuration of max transaction count kvs: support configuration of max transaction count Feb 12, 2025
@chu11 chu11 force-pushed the issue6572_kvs_transaction_max branch 2 times, most recently from a651b3a to 5ed7562 Compare February 13, 2025 22:12
@chu11 chu11 changed the title kvs: support configuration of max transaction count WIP: kvs: support configuration of max transaction count Feb 14, 2025
@chu11 chu11 changed the title WIP: kvs: support configuration of max transaction count kvs: support configuration of max transaction count Feb 19, 2025
@chu11 chu11 force-pushed the issue6572_kvs_transaction_max branch 2 times, most recently from 6bcbcfe to 3706f06 Compare February 19, 2025 18:18
@chu11
Member Author

chu11 commented Feb 19, 2025

A small set of transaction stats:

 "transaction-opcount": {
  "commit": {
   "count": 48829,
   "min": 1,
   "mean": 1.976673698007309,
   "stddev": 6.318251377545355,
   "max": 398
  }
 },

the max number of ops in a transaction is 398, which means my default cap of 64K seems more than fine. I'll still monitor stats until release time, but removing WIP for now so this can be reviewed

@chu11
Member Author

chu11 commented Feb 28, 2025

saw this today

 "transaction-opcount": {
  "commit": {
   "count": 95589,
   "min": 1,
   "mean": 3.4195984893659341,
   "stddev": 42.249674028063289,
   "max": 11395
  }
 },

max of 11K ... 64K would probably still be fine, but upping the default to 128K would be a bit safer.
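
As a rough way to compare the two candidate defaults against the observed maximum, the headroom can be computed as a simple ratio. The helper name `cap_headroom` is hypothetical and only illustrates the arithmetic.

```c
/* Ratio of a proposed cap to the largest observed operation count.
 * Returns 0.0 if observed_max is not positive (nothing to compare).
 */
double cap_headroom (long long cap, long long observed_max)
{
    if (observed_max <= 0)
        return 0.0;
    return (double)cap / (double)observed_max;
}
```

With the max of 11395 from the stats above, `cap_headroom (65536, 11395)` is about 5.75 and `cap_headroom (131072, 11395)` is about 11.5, so 128K leaves roughly twice the margin over the largest commit seen so far.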

@chu11 chu11 force-pushed the issue6572_kvs_transaction_max branch from 3706f06 to a8e4ffa Compare February 28, 2025 19:16
@garlick
Member

garlick commented May 9, 2025

Before we do this, we should probably investigate why we are seeing these high op-count commits.

I think stdout events are batched by this code, which triggers the commit of a batch of events based on a timer. Maybe it should also have a high water mark on the operation count and/or the cumulative size of the events.

https://github.com/flux-framework/flux-core/blob/master/src/common/libeventlog/eventlogger.c

If we can make sure flux-core code is well behaved, then it makes sense to me to impose a limit to avoid regressions and bad behavior by framework projects, but I should think we could set it much lower than 128K. We should also provide some way for API users to know what the limit is. For example, then the above code could set its high water mark accordingly. Maybe we could even just make it a constant in kvs.h.

Note this PR in the title and a few other places uses "transaction" where "operation" is meant. IMHO the title should be "kvs: limit the number of operations per commit".
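
The high water mark idea could be sketched roughly as follows. This is not the eventlogger.c implementation: `struct batch`, `batch_append`, `batch_reset`, and the "0 means no limit" convention are hypothetical names and assumptions. The timer-driven commit would stay as it is; the return value only signals that a commit should happen early.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a batch of pending events. */
struct batch {
    size_t op_count;    /* operations queued so far */
    size_t byte_count;  /* cumulative size of queued events */
    size_t max_ops;     /* high water mark on operations (0 = none) */
    size_t max_bytes;   /* high water mark on bytes (0 = none) */
};

/* Account for one queued event of event_size bytes.
 * Returns true if the caller should commit the batch now rather than
 * waiting for the batch timer to fire.
 */
bool batch_append (struct batch *b, size_t event_size)
{
    b->op_count++;
    b->byte_count += event_size;
    if (b->max_ops > 0 && b->op_count >= b->max_ops)
        return true;
    if (b->max_bytes > 0 && b->byte_count >= b->max_bytes)
        return true;
    return false;
}

/* Reset the counters once the batch has been committed,
 * whether by the timer or by a high water mark.
 */
void batch_reset (struct batch *b)
{
    b->op_count = 0;
    b->byte_count = 0;
}
```

If the per-commit limit were exposed as a constant, `max_ops` could be set from it so that a batch never reaches the limit.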

chu11 added 3 commits July 10, 2025 09:36

Problem: A KVS denial of service is possible because there is
no maximum on the number of operations a user can submit in a KVS
transaction.  For example, a KVS transaction with billions of KVS
entries would lead to a severe degradation in KVS performance.

Support a new KVS configuration "transaction-max-ops" that will reject
KVS transactions with operation counts above a maximum.  The default
maximum is 131072.

Fixes flux-framework#6572

Problem: The new kvs transaction-max-ops configuration
option is not documented.

Add documentation to flux-config-kvs(5).

Problem: There is no coverage for the new kvs transaction-max-ops
configuration.

Add coverage in t1005-kvs-security.t.
@chu11 chu11 force-pushed the issue6572_kvs_transaction_max branch from a8e4ffa to a93874b Compare July 10, 2025 16:36

codecov bot commented Jul 10, 2025

Codecov Report

Attention: Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.

Project coverage is 83.84%. Comparing base (3323cf0) to head (a93874b).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/modules/kvs/kvs.c 90.90% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6581   +/-   ##
=======================================
  Coverage   83.83%   83.84%           
=======================================
  Files         539      539           
  Lines       90283    90305   +22     
=======================================
+ Hits        75693    75713   +20     
- Misses      14590    14592    +2     
Files with missing lines Coverage Δ
src/modules/kvs/kvs.c 74.60% <90.90%> (+0.26%) ⬆️

... and 2 files with indirect coverage changes


@chu11 chu11 changed the title kvs: support configuration of max transaction count kvs: support configuration of max operations count Jul 11, 2025
@chu11
Copy link
Member Author

chu11 commented Jul 11, 2025

If we can make sure flux-core code is well behaved, then it makes sense to me to impose a limit to avoid regressions and bad behavior by framework projects, but I should think we could set it much lower than 128K. We should also provide some way for API users to know what the limit is. For example, then the above code could set its high water mark accordingly. Maybe we could even just make it a constant in kvs.h.

Good point. I'll write up an issue to investigate the high op-count commits as well. However, this PR was initially developed to defend against a denial of service, so I think it is worthwhile to have the max independent of that investigation. Right now, a rogue user (or misbehaving code) could create a KVS transaction with like a bajillion operations in it.
