Skip to content

Conversation

gdrosos
Copy link
Contributor

@gdrosos gdrosos commented Aug 17, 2025

SUMMARY

Attempt to fix incorrect reporting when uninstalling default gems on Ubuntu.
Previously, the gem module reported changed: true even though default gems (e.g. json) cannot be removed.
On Ubuntu, gem uninstall exits with code 0 but leaves the gem installed, misleading the module.
With this patch, the module now re-checks after uninstall and fails if the gem is still present. I also added integration tests

Fixes #10451

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

gem

@ansibullbot
Copy link
Collaborator

cc @None @johanwiren
click here for bot help

@ansibullbot ansibullbot added bug This issue/PR relates to a bug integration tests/integration module module plugins plugin (any type) tests tests needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR labels Aug 17, 2025
@ansibullbot ansibullbot removed the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR label Aug 17, 2025
Copy link
Collaborator

@russoz russoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments.

Additionally, I would point that the module has been silently ignoring those gems that cannot be uninstalled until now, so popping an error out of the sudden is a breaking change.

I can imagine that some users might not care whether the specific package was uninstalled or not, while others might want to do something about it. I am thinking this might be better with some flag to determine how the module should behave in that case. I can think of 3 scenarios:

  1. ignore entirely (current behaviour)
  2. generate warning (module result is success)
  3. generate error (module result is failure)

community.general.gem:
name: json
state: absent
when: ansible_distribution == "Ubuntu"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good enough for now, but I would suppose this happens because that gem must be installed by default in a .deb package and there is some mechanism in place to prevent the removal. I would suppose other distros/OS'es will have something similar, maybe not by default but after the OS package is installed.

One specific gem in one specific OS feels a bit fragile to me. For the future it might be interesting to better understand this issue and how the OS blocks the removal and either:

  1. ensure blocked-removal gems are installed in different platforms, or
  2. disregard the OS entirely and make the test block a gem and then try to remove it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My brain coming back to this: I think the main issue I have with this is that the condition for that situation is very likely NOT ansible_distribution == "Ubuntu", but rather that gem located the json gem installed in, say, /usr/lib/ruby/..../ and does not: a) have permissions or, b) allows removing from that location at all or, c) does not allow removing removing from that location without some kind of --force parameter

@felixfontein felixfontein added check-before-release PR will be looked at again shortly before release and merged if possible. backport-10 Automatically create a backport for the stable-10 branch backport-11 Automatically create a backport for the stable-10 branch and removed backport-10 Automatically create a backport for the stable-10 branch labels Aug 18, 2025
@ansibullbot ansibullbot added the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR label Aug 18, 2025
@ansibullbot ansibullbot removed the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR label Aug 18, 2025
Comment on lines +337 to +338
command_output = uninstall(module)
if command_output is not None and len(command_output) == 3 and exists(module):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uninstall() ends with return module.run_command(), so it should always return the tuple. However, in line 227 above, when in check mode, it returns None.

So, arguably, if command_output is not None then it MUST be a tuple of length 3, so no need to test that.

Copy link
Contributor Author

@gdrosos gdrosos Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove the len check then the unit tests break, see here:
https://github.com/ansible-collections/community.general/actions/runs/17037384348/job/48292931828

(CI of commit d5575c5)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but why? Is this test running in check mode? It doesn't look like. So what is the value of command_output in line 338 for that testcase?

Logic says it should either be None or a tuple of 3. Either my reading of the code is wrong and there is more to it, or something else is broken.

Or, let' s put it this other way, we have 3 cases for command_output:

  1. None
  2. 3-tuple
  3. this X value we are not sure

What should the module do when the result of that command is X? Right now it falls in the common case of the else and declares changed=True and goes to the successful end. Given that it is not clear what is happening, that sounds like a loose-end.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the unit tests, run_module() is patched, but it doesn't return anything specific (probably some automatic mock object?). Simplify adjusting the tests so that patch_run_command() returns a mock that returns (0, '', '') when called should fix this.

if command_output is not None and len(command_output) == 3 and exists(module):
rc, out, err = command_output
module.fail_json(
msg="Gem could not be removed",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also reckon the wording could be a tad more explanatory. Imagine the user running this like they have done before, and now getting a message "Gem could not be removed". But... why? What should they do next? For us, reading this PR, we have some idea of what is going on, but they will not have a clue.

@ansibullbot ansibullbot added the stale_ci CI is older than 7 days, rerun before merging label Aug 31, 2025
@russoz
Copy link
Collaborator

russoz commented Sep 26, 2025

friendly ping @gdrosos are you still working on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-11 Automatically create a backport for the stable-10 branch bug This issue/PR relates to a bug check-before-release PR will be looked at again shortly before release and merged if possible. integration tests/integration module module plugins plugin (any type) stale_ci CI is older than 7 days, rerun before merging tests tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gem: erroneously reports success when uninstalling default gems on Ubuntu
4 participants