-
Notifications
You must be signed in to change notification settings - Fork 130
[PULP-553] Enable immediate task execution on content app (1/2) #6937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b7ce573
to
f636c74
Compare
f636c74
to
7c7b0f6
Compare
pulpcore/app/tasks/repository.py
Outdated
new_version.add_content(models.Content.objects.filter(pk__in=add_content_units)) | ||
|
||
|
||
async def async_add_and_remove( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is always run as a task (at least it really should be). So let's not duplicate it. Let's just make it async for real.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here's another thing to watch out for: Both content creation from an artifact as well as creating a new repository version may involve an amount of calculation that is out of our control. Depending on the plugin, this may run into timeouts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice observation. If that starts to happen often we should probably not call it as immediate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problematic thing is there is no way to know. And once we timed out the task there is no recovering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is always run as a task (at least it really should be). So let's not duplicate it. Let's just make it async for real.
In ansible it's called directly (inside a task, but not by means of dispatch). https://github.com/pulp/pulp_ansible/blob/37d1ce28ce2e3743b82e4d5b5b4b5461896e7935/pulp_ansible/app/tasks/mark.py#L22
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problematic thing is there is no way to know. And once we timed out the task there is no recovering.
If I remember correctly, that is the case only for the content app, right? It's handling content serving requests, so it can't report this kind of side-effect failure.
What if we dispatch to workers in case of timeout failure? 👀
pulpcore/app/models/task.py
Outdated
self.arefresh_from_db() | ||
raise RuntimeError( | ||
_("Falied to set task {} unblocked in state '{}'.").format(self.pk, self.state) | ||
_("Failed to set task {} unblocked in state '{}'.").format(self.pk, self.state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing on L345.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I don't think it's wise to translate error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I remember a wider team discussion where we agreed on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async_pull_through_add_content
can replace pull_through_add_content
. This one is only used in pulpcore.
I will do that on the last step, which is adding this to the content app handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdellweg can you checkpoint here? And wdyt of breaking this up in preparation + fix ? Fix meaning actually calling the async dispatch from content app, not only in the unit test. If you agree, I can mark it ready to review, rebase and slighly rename the title. At this point (I hope) this PR doesn't change any behavior (apart from some log message contents). |
async def test_async_pull_through_add(ca1, monkeypatch): | ||
# setup dispatch context | ||
set_guid(uuid.uuid4()) | ||
app_status = await AppStatus.objects.acreate(app_type="content", name="test_runner") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit dangerous. I think we need to mock out the registered app status variable so it gets reset after the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont understand what you mean that is dangerous. The AppStatus is required to exist in the test database because task creation needs a valid reference to it on the db level. And it's being cleaned up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AppStatus.objects keeps a record of this when creating, so it can implement the current()
.
If you'd write a second such test, it would fail.
And if we don't address this now, we would probably have a really hard time understanding what's wrong later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, its kept in the _current_app_status
.
Can I implement a delete override in the AppStatus.objects to clear that state if the deleted object matches the current app for the process?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work:
https://github.com/pulp/pulpcore/blob/main/pulpcore/tests/unit/models/test_task.py#L16
Restarting once failed tasks is not safe. So running into timeouts is final. |
9fc8b2e
to
85700a1
Compare
85700a1
to
c731271
Compare
|
||
async def aexecute_task(task): | ||
# Store the task id in the context for `Task.current()`. | ||
current_task.set(task) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is setting a context var. And I think you are missing the separation from the calling context (see _execute_task
).
(We should remember to pull all the context vars setting into here... but not this PR.)
We should be able to directly dispatch from async context using async-ready functions. Refactoring was done to make it easier to maintain both sync/async version of the respective functions. Co-authored-by: Matthias Dellweg <[email protected]>
c731271
to
58f03c6
Compare
Plan:
- [ ] Change content app to use async pull_through_add_content and remove sync one (only used here)(left to another PR)