-
Notifications
You must be signed in to change notification settings - Fork 244
Draft of tron support in remote-run - proof of concept #4114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm in general
try: | ||
# Load the service deployment settings | ||
deployment_config = load_eks_or_adhoc_deployment_config( | ||
service, instance, cluster, is_toolbox, user | ||
) | ||
except: | ||
# tron | ||
tron = True | ||
deployment_config = load_tron_config(service, instance, cluster) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's abstract this to a function too, so it a bit cleaner
deployment_config, is_tron = load_any_deployment_config(...)
) | ||
return projected_volumes if projected_volumes else None | ||
|
||
def get_kubernetes_service_account_name(self) -> Optional[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely a whole lot of repetition, but there's probably no other way for it without some huge refactor, so as long as CI is fine with it, I'm fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(we chatted a bit internally in the project channel - just leaving a review here so that GH stops pinging me about this PR)
else: | ||
# Tron dicts use "command" instead of "cmd" and expects an array | ||
deployment_config.config_dict["command"] = ["/usr/bin/sleep", str(max_duration)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunate 😭
this is unfinished, seeking early feedback
Experiment to see what it would take to use Tron jobs with remote-run. To implement this fully in paasta, I updated TronActionConfig with functions get format_as_kubernetes_job working like it does for long running jobs. These were mostly copied from kubernetes_tools with modifications and removals of things that seem unnecessary in this context. If we go with this approach, there's lots of opportunity for refactoring.
There's several TODOs of stuff that I skipped for now like volumes.
This works, but I want feedback before cleaning it up. The main alternative to this is to import task_processing/tron and try to make use of the relevant code there which does something similar.