Skip to content

Conversation

pedro-psb
Copy link
Member

@pedro-psb pedro-psb commented Sep 11, 2025

Plan:

  • Refactor sync dispatch and execute
  • Add async dispatch and execute task functions
  • Add async pull_through_add_content and unit test it
    - [ ] Change content app to use async pull_through_add_content and remove sync one (only used here) (left to another PR)

@pedro-psb pedro-psb changed the title Enable immediate task execution on content app [PULP-553] Enable immediate task execution on content app Sep 11, 2025
@pedro-psb pedro-psb force-pushed the fix/immediate-on-content-app branch from b7ce573 to f636c74 Compare September 11, 2025 19:48
@pedro-psb pedro-psb force-pushed the fix/immediate-on-content-app branch from f636c74 to 7c7b0f6 Compare September 11, 2025 22:01
new_version.add_content(models.Content.objects.filter(pk__in=add_content_units))


async def async_add_and_remove(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is always run as a task (at least it really should be). So let's not duplicate it. Let's just make it async for real.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here's another thing to watch out for: Both content creation from an artifact as well as creating a new repository version may involve an amount of calculation that is out of our control. Depending on the plugin, this may run into timeouts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice observation. If that starts to happen often we should probably not call it as immediate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problematic thing is there is no way to know. And once we timed out the task there is no recovering.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is always run as a task (at least it really should be). So let's not duplicate it. Let's just make it async for real.

In ansible it's called directly (inside a task, but not by means of dispatch). https://github.com/pulp/pulp_ansible/blob/37d1ce28ce2e3743b82e4d5b5b4b5461896e7935/pulp_ansible/app/tasks/mark.py#L22

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problematic thing is there is no way to know. And once we timed out the task there is no recovering.

If I remember correctly, that is the case only for the content app, right? It's handling content serving requests, so it can't report this kind of side-effect failure.

What if we dispatch to workers in case of timeout failure? 👀

self.arefresh_from_db()
raise RuntimeError(
_("Falied to set task {} unblocked in state '{}'.").format(self.pk, self.state)
_("Failed to set task {} unblocked in state '{}'.").format(self.pk, self.state)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing on L345.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don't think it's wise to translate error messages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I remember a wider team discussion where we agreed on that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async_pull_through_add_content can replace pull_through_add_content. This one is only used in pulpcore.
I will do that on the last step, which is adding this to the content app handler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pedro-psb
Copy link
Member Author

@mdellweg can you checkpoint here?

And wdyt of breaking this up in preparation + fix ? Fix meaning actually calling the async dispatch from content app, not only in the unit test. If you agree, I can mark it ready to review, rebase and slighly rename the title.

At this point (I hope) this PR doesn't change any behavior (apart from some log message contents).

async def test_async_pull_through_add(ca1, monkeypatch):
# setup dispatch context
set_guid(uuid.uuid4())
app_status = await AppStatus.objects.acreate(app_type="content", name="test_runner")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit dangerous. I think we need to mock out the registered app status variable so it gets reset after the test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont understand what you mean that is dangerous. The AppStatus is required to exist in the test database because task creation needs a valid reference to it on the db level. And it's being cleaned up.

Copy link
Member

@mdellweg mdellweg Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AppStatus.objects keeps a record of this when creating, so it can implement the current().
If you'd write a second such test, it would fail.

And if we don't address this now, we would probably have a really hard time understanding what's wrong later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, its kept in the _current_app_status.
Can I implement a delete override in the AppStatus.objects to clear that state if the deleted object matches the current app for the process?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mdellweg
Copy link
Member

Restarting once failed tasks is not safe. So running into timeouts is final.
And now I'm wondering if we ever want the content app to perform task work. It all blurs completely what part of the app is responsible for what part of the work.
OTOH having an async version of dispatch is probably a good idea. So yes, go ahead.

@pedro-psb pedro-psb force-pushed the fix/immediate-on-content-app branch from 9fc8b2e to 85700a1 Compare September 15, 2025 11:32
@pedro-psb pedro-psb marked this pull request as ready for review September 15, 2025 11:32
@pedro-psb pedro-psb changed the title [PULP-553] Enable immediate task execution on content app [PULP-553] Enable immediate task execution on content app (1/2) Sep 15, 2025
@pedro-psb pedro-psb force-pushed the fix/immediate-on-content-app branch from 85700a1 to c731271 Compare September 15, 2025 18:22

async def aexecute_task(task):
# Store the task id in the context for `Task.current()`.
current_task.set(task)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is setting a context var. And I think you are missing the separation from the calling context (see _execute_task).

(We should remember to pull all the context vars setting into here... but not this PR.)

We should be able to directly dispatch from async context using
async-ready functions. Refactoring was done to make it easier to
maintain both sync/async version of the respective functions.

Co-authored-by: Matthias Dellweg <[email protected]>
@pedro-psb pedro-psb force-pushed the fix/immediate-on-content-app branch from c731271 to 58f03c6 Compare September 16, 2025 15:07
@pedro-psb pedro-psb merged commit deee651 into pulp:main Sep 16, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants