Skip to content

Conversation

@frankmcsherry
Copy link
Member

The upsert.rs operator was doing non-trivial work as it read its input (inserting records into a priority queue). This has the potential to trap workers, who may not be able to retire input as fast as it is produced. Ideally, operators would take a fixed amount of input data (e.g. whatever is available when scheduled) and only do that work, rather than continuing to pull in new data as they run.

This modification makes the workers just drain their inputs without performing work, and then insert the now bounded number of records into the priority queue.

cc: @cirego

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants