Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows ssh support by ipcluster #842

Open
wants to merge 79 commits into
base: main
Choose a base branch
from

Conversation

ottointhesky
Copy link

As discussed in the issue 837 I have implemented an abstraction layer (ipyparallel.cluster.shellcmd.ShellCommandSend and ...ShellCommandReceive) for sending shell commands in a platform independent manner. The API provides a variety of different standard shell commands (start, running, kill, mkdir, rmdir, exists and remove) and some generic shell/python code command execution features which encapsulates necessary quoting and different handling of different platforms and shells. Currently, Linux (bash, ssh) and Windows (cmd, powershell, ssh) are supported but it should be easy to support MacOS as well (I will have a look after sending the pull request).

The abstraction layer was mainly developed to send shell command through an ssh connection but it works equally for sending commands to a local shell. All possible types of shell connections are test in the corresponding pytest script (test_shellcmd.py)

I have extended the ssh-cluster-test that it also runs under Windows, BUT unfortunately there is an issue that prevents the ssh-cluster-test (and test_shellcmd) to run successfully in Github Actions. Everything works perfectly in a local environment, but with Github Runners it is not possible to start a process with the CREATE_BREAKAWAY_FROM_JOB flag set (the flag is absolute necessary, otherwise the started process gets kill when the ssh connection is closed). I’m not the first one that came across this issue, but obviously there is no easy work-a-round or solution to this (Have a look at win-start_issue.py which shows the Github behaviour in the corresponding test )

I will keep trying to find a trick or work-a-round to that issue, but I fear there isn’t. Not testing ssh-cluster under Windows isn’t an option (from my point of view), so maybe a self-hosted Windows runner could be the solution. Do you have experience with that? How should we proceed?

By the way, shellcmd implementation also has a cli ipyparallel.shellcmd which is NOT used. Hence, it could be safely dropped…

Johannes Otepka and others added 20 commits June 3, 2024 16:45
code refactoring: simplifcation

Co-authored-by: Min RK <[email protected]>
* use_break_way renamed to use_breakaway
* comparison with True or False removed
* minor spelling corrections in comments
* str format changes demanded by pre-commit
* avoid os.system calls
* breakaway docu section added
* minor code refactoring
* avoid os.system calls
* breakaway docu section added
* minor code refactoring
* only support str to str dicts as env for cmd_start
* check added that env dict is only str to str
* necessary adaption in test_shellcmd.py
* start_cmd has to be a list of string (check added/automatic str conversion removed)
@ottointhesky
Copy link
Author

I think we want to be as insensitive to command-line parsing and splitting as we can, so when we're invoking a remote python command, we should be serializing things in a more rigorous way, like a single JSON object for input. Getting a single argument serialized and parsed correctly, rathe than a general utility for parsing lists of args, some of which may contain command lists to split, some may be single arguments, etc.

That way we should only have to handle the platform sensitivity of serialization and parsing exactly once in exactly one place, and the rest should be serialized "correctly" as Python lists and dicts that get passed to Popen, etc. entirely unmodified on the receiving side.

I agree and your suggestions led to a straightening of the code. Now, there is only a minimal set of parsing action left in ShellCommandSend._cmd_send.
For completeness it should by mentioned that function ShellCommandSend._cmd_send_via_shell is obsolete. It represents my first approach of sending commands before switching to python commands in all cases. I got it working at the end, but I’m pretty sure that in some strange input cases it would fail. Nevertheless, ShellCommandSend._cmd_send_via_shell has the advantage that is doesn’t reply on a python installation on the other receiver side which is probably irrelevant for the ipyparallel project. So should I remove it?

ShellCommandSend._cmd_send, on the hand, should perfectly work in arbitrary complex parameter and shell cases.
In my point of view ShellCommandSend is now pretty straight forward. Only the annoying breakaway issue in the github runner make things a bit complicated. Should I put the cmd == 'start' and self.breakaway_support is False if branch code into a separated function?

return Platform.Linux
elif tmp == "darwin":
return Platform.MacOS
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will crash on freebsd, etc. I don't think we have any platform-specific logic except "Windows" and "not Windows", so that's what we should capture. Instead of raising on unrecognized platforms, we should assume posix.

As a result, I think we can simplify this enum class to a single WINDOWS boolean, or if you like enum WINDOWS and POSIX, where we assume POSIX if not WINDOWS.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your are right - didn't work on freebsd.
A classification in Windows and Posix would definitely work. Personally, MacOS feels so much differently to me which is why I would keep the MacOS enum. Nevertheless, it seems that MacOS is very close to an
posix system so final I leave the final decision up to you. Dual enum?

platform = Platform.get()
if platform == Platform.Windows:
userdir = os.environ["USERPROFILE"]
elif platform == Platform.Linux:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to special-case linux:

Suggested change
elif platform == Platform.Linux:
else:

elif platform == Platform.Linux:
userdir = os.environ["HOME"] if "HOME" in os.environ else ""
self.filename = filename.replace(
"${userdir}", userdir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should be the more standard ${HOME} and use Path.home(), which already handles USERPROFILE


def __del__(self):
if self.log:
self._log("ShellCommandReceive instance deleted")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__del__ is a notoriously unreliable method that is not called consistently. If there are cleanup operations to do, the object should have an explicit close() method and/or __enter__/__exit__ methods to be used as a context manager.

It's still okay for __del__ to call close if it's left open, but if __del__ is actually what ends up closing the file, that should generally result in a ResourceWarning

def cmd_rmdir(self, path):
self._log(f"Remove directory '{path}'")

import shutil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to module level import

ssh $env:SSH_HOST -p $env:SSH_PORT "pip list"

echo "Check if container can ping the docker host (requires adapted hosts file and firewall disabled)"
docker run ipyparallel-sshd ping -n 1 $env:computername
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this test is so different for Windows, maybe just the new test-ssh.yml should test ssh instead of mixing it in here. That might remove a lot of complicated conditionals.

@@ -1 +0,0 @@
docs/source/examples
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know why this file was deleted?

# out = check_output(full_cmd, input=None, start_new_session=True).decode(
# "utf8", "replace"
# )
out = self._secure_ssh_sender().check_output_python_code(python_code)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between self.ssh_sender and self._secure_ssh_sender?


if self.platform == Platform.Windows:
# taken from https://stackoverflow.com/questions/568271/how-to-check-if-there-exists-a-process-with-a-given-pid-in-python
import ctypes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just from a code organization standpoint, does it make sense to have a _shellcmd_windows and _shellcmd_posix module with the same API, rather than having all of these imports under if statements. This really is two implementations with ~no code shared, so it might be easier to grasp by splitting it to one file per platform (there are only two platforms).

We can keep the class here, and use functions for the platform-dependent implementations. Or a mixin class, if that appeals to you more.

Seeing how these are transmitted, perhaps separate files would complicate things. But at least separate classes seems to make sense to me. What do you think?

output_template = re.compile(
r"__([a-z][a-z0-9_]+)=([a-z0-9\-\.]+)__", re.IGNORECASE
)
receiver_code = inspect.getsource(ShellCommandReceive)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than extracting snippets, can we make it a whole standalone file?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants