Skip to content

Conversation

@martyone
Copy link
Contributor

It is common misunderstanding that SSH accepts CMD [ARGS...]. Actually
it joins all positional arguments and passes the resulting single string
to sh -c.

It is common misunderstanding that SSH accepts CMD [ARGS...]. Actually
it joins all positional arguments and passes the resulting single string
to sh -c.
Copy link
Contributor Author

@martyone martyone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Used GitHub's review interface to post comments to the lines changed, hopefully it will handle this well)

I noticed in the alternative PR #20 that this has been already dealt with by quoting arguments that were observed to contain spaces in past. Note that this broke the scenario where the same arguments were used with the run() function instead of ssh() (seems no one has been using it this way for couple of months at least). Fixing that I also noticed the same issue duplicated in tester.py, so extended the commit.

print "trying to get any test results"
self.commands.scpfrom("/tmp/results/*", self.results_dir)
self.commands.ssh(['rm', '-rf', '/tmp/results/*'])
self.commands.ssh(['sh', '-c', 'rm -rf /tmp/results/*'])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice how the original arguments differ from the invocation of run() couple of lines above. Now ssh() and run() can be used equally just like popen with list of string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This confuses me. We end up with

sh -c sh -c "'rm -rf /tmp/results/*'"

do we need a double sh -c to undo a pipes.quote level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it will be

sh -c "sh -c 'rm -rf /tmp/results/*'"

We need it to get glob patterns resolved or more generally for some shell functionality.

Saying "to undo a pipes.quote level" is basically true but uncovers an implementation detail that the user of ssh() shouldn't care about. The user of ssh() needs to know that ssh() accepts a list of arguments to be directly stuffed into argv and nothing more. The user of ssh() doesn't care about unquotting; the user of ssh() sees the need for some shell functionality, so the command he executes is a shell interpreter with a shell script supplied as an argument.

This is well illistrated with the run() invocations above on lines 586 and 587 which I reffered to in my original comment. First invocation (cp -v) needs some shell functionality so it executes sh -c with the actuall command supplied in form of a shell script. Second invocation (rm -rf) has all arguments ready to be directly stuffed to argv.

Note that ideally the first invocation would have every non-literal input quoted as in

self.commands.run(['sh', '-c', "cp -v /tmp/" + pipes.quote(self.test_id) + "/results/* " + pipes.quote(self.results_dir)])

Thinking of a case when a malicious test_id is supplied by an attacker, e.g. 'foobar; rm -rf /'.

elif self.ict == "mic":
mic_comm.append('%s' % job_args.image_type)
mic_comm.append('"%s"' % job_args.ksfile_name)
mic_comm.append('%s' % job_args.ksfile_name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice how mic_comm is passed either to ssh() or run() below, based on some condition. The extra quotes fixes the original ssh() invocation but otoh would break the run() invocation.


if " " in tokenvalue:
tokenvalue = '"%s"' % tokenvalue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this would broke the local execution as commented before.

@larstiq
Copy link
Contributor

larstiq commented Oct 8, 2017

I noticed in the alternative PR #20 that this has been already dealt with by
quoting arguments that were observed to contain spaces in past. Note that this
broke the scenario where the same arguments were used with the run() function
instead of ssh() (seems no one has been using it this way for couple of months
at least). Fixing that I also noticed the same issue duplicated in tester.py,
so extended the commit.

Right, I never considered the run() case. If they were/are indeed meant to be
used with the same api then some of the calling code was broken, as well as the
ssh implementation. Fixing them like this makes sense to me, so let's
continue with this PR making sure we catch all issues.

Copy link
Contributor

@larstiq larstiq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other areas to look at:

- ImageTest.update_vm invokes

        addrepo_comm.extend([reponame, '%s' % repo])
        self.commands.ssh(addrepo_comm)

- ImageTester.install_tests invokes

        addrepo_comm.extend(['testtools', '%s' % self.testtools_repourl])
        self.commands.ssh(addrepo_comm)

- ssh() calls run() calls self.run(kill_comm), kill_comm being `pkill -f "
  ".join(command)`, does this all keep working correctly?


- are there other parts of mic_comm that are pre-escaped?

        In theory we can get old data when a job is resubmitted, not sure
        we should worry about that.

I'd also argue we should test this heavily, this change has more than usual
risk of breaking image builds.

#~ along with this program. If not, see <http://www.gnu.org/licenses/>.

import os, sys
import pipes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we start caring about Python3 compatibility? I.e:

try:
    from pipes import quote
except ImportError:
    # Moved in Python3.3
    from shlex import quote

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to port it as a whole instead (in a single step).

print "trying to get any test results"
self.commands.scpfrom("/tmp/results/*", self.results_dir)
self.commands.ssh(['rm', '-rf', '/tmp/results/*'])
self.commands.ssh(['sh', '-c', 'rm -rf /tmp/results/*'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This confuses me. We end up with

sh -c sh -c "'rm -rf /tmp/results/*'"

do we need a double sh -c to undo a pipes.quote level?

Copy link
Contributor Author

@martyone martyone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other areas to look at:
[...]
addrepo_comm.extend([reponame, '%s' % repo])

I would say these are equal to pass the variable directly and the only reason why the percent substitution is used is to keep similar code visually aligned.

  • ssh() calls run() calls self.run(kill_comm), kill_comm being pkill -f " ".join(command), does this all keep working correctly?

Yes IIUC it will work equally.

BTW, thinking why is this pkill required at all when it is followed by proc.terminate() which sends SIGTERM to the process selected by stored PID. It seems it is only needed to properly kill qemu-kvm which is executed with '-daemonize' (forking again). Does the '-daemonize' option have any other effect for which it is needed? If not then omitting the '-daemonize' option should make the pkill useless.

  • are there other parts of mic_comm that are pre-escaped?

      In theory we can get old data when a job is resubmitted, not sure
      we should worry about that.
    

I'd also argue we should test this heavily, this change has more than usual risk of breaking image builds.

This change is currently needed just for SDK builds, i.e., reverting this change only affects SDK builds. So unless it takes too much time to (re)deploy imager I suggest to simply test this in production environment.

print "trying to get any test results"
self.commands.scpfrom("/tmp/results/*", self.results_dir)
self.commands.ssh(['rm', '-rf', '/tmp/results/*'])
self.commands.ssh(['sh', '-c', 'rm -rf /tmp/results/*'])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it will be

sh -c "sh -c 'rm -rf /tmp/results/*'"

We need it to get glob patterns resolved or more generally for some shell functionality.

Saying "to undo a pipes.quote level" is basically true but uncovers an implementation detail that the user of ssh() shouldn't care about. The user of ssh() needs to know that ssh() accepts a list of arguments to be directly stuffed into argv and nothing more. The user of ssh() doesn't care about unquotting; the user of ssh() sees the need for some shell functionality, so the command he executes is a shell interpreter with a shell script supplied as an argument.

This is well illistrated with the run() invocations above on lines 586 and 587 which I reffered to in my original comment. First invocation (cp -v) needs some shell functionality so it executes sh -c with the actuall command supplied in form of a shell script. Second invocation (rm -rf) has all arguments ready to be directly stuffed to argv.

Note that ideally the first invocation would have every non-literal input quoted as in

self.commands.run(['sh', '-c', "cp -v /tmp/" + pipes.quote(self.test_id) + "/results/* " + pipes.quote(self.results_dir)])

Thinking of a case when a malicious test_id is supplied by an attacker, e.g. 'foobar; rm -rf /'.

#~ along with this program. If not, see <http://www.gnu.org/licenses/>.

import os, sys
import pipes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to port it as a whole instead (in a single step).

@xfade xfade merged commit 7117948 into MeeGoIntegration:master Nov 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants