-
Couldn't load subscription status.
- Fork 11
Safely invoke commands over SSH #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It is common misunderstanding that SSH accepts CMD [ARGS...]. Actually it joins all positional arguments and passes the resulting single string to sh -c.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Used GitHub's review interface to post comments to the lines changed, hopefully it will handle this well)
I noticed in the alternative PR #20 that this has been already dealt with by quoting arguments that were observed to contain spaces in past. Note that this broke the scenario where the same arguments were used with the run() function instead of ssh() (seems no one has been using it this way for couple of months at least). Fixing that I also noticed the same issue duplicated in tester.py, so extended the commit.
| print "trying to get any test results" | ||
| self.commands.scpfrom("/tmp/results/*", self.results_dir) | ||
| self.commands.ssh(['rm', '-rf', '/tmp/results/*']) | ||
| self.commands.ssh(['sh', '-c', 'rm -rf /tmp/results/*']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice how the original arguments differ from the invocation of run() couple of lines above. Now ssh() and run() can be used equally just like popen with list of string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This confuses me. We end up with
sh -c sh -c "'rm -rf /tmp/results/*'"
do we need a double sh -c to undo a pipes.quote level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it will be
sh -c "sh -c 'rm -rf /tmp/results/*'"
We need it to get glob patterns resolved or more generally for some shell functionality.
Saying "to undo a pipes.quote level" is basically true but uncovers an implementation detail that the user of ssh() shouldn't care about. The user of ssh() needs to know that ssh() accepts a list of arguments to be directly stuffed into argv and nothing more. The user of ssh() doesn't care about unquotting; the user of ssh() sees the need for some shell functionality, so the command he executes is a shell interpreter with a shell script supplied as an argument.
This is well illistrated with the run() invocations above on lines 586 and 587 which I reffered to in my original comment. First invocation (cp -v) needs some shell functionality so it executes sh -c with the actuall command supplied in form of a shell script. Second invocation (rm -rf) has all arguments ready to be directly stuffed to argv.
Note that ideally the first invocation would have every non-literal input quoted as in
self.commands.run(['sh', '-c', "cp -v /tmp/" + pipes.quote(self.test_id) + "/results/* " + pipes.quote(self.results_dir)])
Thinking of a case when a malicious test_id is supplied by an attacker, e.g. 'foobar; rm -rf /'.
| elif self.ict == "mic": | ||
| mic_comm.append('%s' % job_args.image_type) | ||
| mic_comm.append('"%s"' % job_args.ksfile_name) | ||
| mic_comm.append('%s' % job_args.ksfile_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice how mic_comm is passed either to ssh() or run() below, based on some condition. The extra quotes fixes the original ssh() invocation but otoh would break the run() invocation.
|
|
||
| if " " in tokenvalue: | ||
| tokenvalue = '"%s"' % tokenvalue | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this would broke the local execution as commented before.
Right, I never considered the run() case. If they were/are indeed meant to be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other areas to look at:
- ImageTest.update_vm invokes
addrepo_comm.extend([reponame, '%s' % repo])
self.commands.ssh(addrepo_comm)
- ImageTester.install_tests invokes
addrepo_comm.extend(['testtools', '%s' % self.testtools_repourl])
self.commands.ssh(addrepo_comm)
- ssh() calls run() calls self.run(kill_comm), kill_comm being `pkill -f "
".join(command)`, does this all keep working correctly?
- are there other parts of mic_comm that are pre-escaped?
In theory we can get old data when a job is resubmitted, not sure
we should worry about that.
I'd also argue we should test this heavily, this change has more than usual
risk of breaking image builds.
| #~ along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
|
|
||
| import os, sys | ||
| import pipes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we start caring about Python3 compatibility? I.e:
try:
from pipes import quote
except ImportError:
# Moved in Python3.3
from shlex import quote
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to port it as a whole instead (in a single step).
| print "trying to get any test results" | ||
| self.commands.scpfrom("/tmp/results/*", self.results_dir) | ||
| self.commands.ssh(['rm', '-rf', '/tmp/results/*']) | ||
| self.commands.ssh(['sh', '-c', 'rm -rf /tmp/results/*']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This confuses me. We end up with
sh -c sh -c "'rm -rf /tmp/results/*'"
do we need a double sh -c to undo a pipes.quote level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other areas to look at:
[...]
addrepo_comm.extend([reponame, '%s' % repo])
I would say these are equal to pass the variable directly and the only reason why the percent substitution is used is to keep similar code visually aligned.
- ssh() calls run() calls self.run(kill_comm), kill_comm being
pkill -f " ".join(command), does this all keep working correctly?
Yes IIUC it will work equally.
BTW, thinking why is this pkill required at all when it is followed by proc.terminate() which sends SIGTERM to the process selected by stored PID. It seems it is only needed to properly kill qemu-kvm which is executed with '-daemonize' (forking again). Does the '-daemonize' option have any other effect for which it is needed? If not then omitting the '-daemonize' option should make the pkill useless.
are there other parts of mic_comm that are pre-escaped?
In theory we can get old data when a job is resubmitted, not sure we should worry about that.I'd also argue we should test this heavily, this change has more than usual risk of breaking image builds.
This change is currently needed just for SDK builds, i.e., reverting this change only affects SDK builds. So unless it takes too much time to (re)deploy imager I suggest to simply test this in production environment.
| print "trying to get any test results" | ||
| self.commands.scpfrom("/tmp/results/*", self.results_dir) | ||
| self.commands.ssh(['rm', '-rf', '/tmp/results/*']) | ||
| self.commands.ssh(['sh', '-c', 'rm -rf /tmp/results/*']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it will be
sh -c "sh -c 'rm -rf /tmp/results/*'"
We need it to get glob patterns resolved or more generally for some shell functionality.
Saying "to undo a pipes.quote level" is basically true but uncovers an implementation detail that the user of ssh() shouldn't care about. The user of ssh() needs to know that ssh() accepts a list of arguments to be directly stuffed into argv and nothing more. The user of ssh() doesn't care about unquotting; the user of ssh() sees the need for some shell functionality, so the command he executes is a shell interpreter with a shell script supplied as an argument.
This is well illistrated with the run() invocations above on lines 586 and 587 which I reffered to in my original comment. First invocation (cp -v) needs some shell functionality so it executes sh -c with the actuall command supplied in form of a shell script. Second invocation (rm -rf) has all arguments ready to be directly stuffed to argv.
Note that ideally the first invocation would have every non-literal input quoted as in
self.commands.run(['sh', '-c', "cp -v /tmp/" + pipes.quote(self.test_id) + "/results/* " + pipes.quote(self.results_dir)])
Thinking of a case when a malicious test_id is supplied by an attacker, e.g. 'foobar; rm -rf /'.
| #~ along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
|
|
||
| import os, sys | ||
| import pipes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to port it as a whole instead (in a single step).
It is common misunderstanding that SSH accepts CMD [ARGS...]. Actually
it joins all positional arguments and passes the resulting single string
to sh -c.