Include multiple files for TrainingClient().create_job() #2233
Comments
Thank you for creating this @u66u! We have discussed exactly this capability with various users. I think we have two options to solve this problem:
from kubeflow.training import TrainingClient

def submit_to_trainjob():
    from my_model import train
    train()

TrainingClient().train(
    name="my-job",
    train_func=submit_to_trainjob,
)

Since my_model.py file will be located in the
We are looking for various options on how to distribute the user's training code into TrainJob resources. cc @kubeflow/wg-training-leads @shravan-achar
/remove-label lifecycle/needs-triage
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
For anyone interested, this feature will be implemented as part of this issue: #2347
What you would like to be added?
I want to create a base Docker image that will be used for training different models. It shouldn't include any training-specific files and should serve only as a base layer. I would then use it as base_image in TrainingClient().create_job().
Then I write the training code in my Kubeflow notebook or on my local machine:
However, train_func just gets its code copied onto the Kubernetes cluster, so I can't import anything from my local modules. I can only use pip libraries that are in my base Docker image, and I also can't import anything from the file that train_func is defined in:
Both cases fail with an unresolved import error.
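To make the failure concrete, here is a minimal local sketch of what seems to be happening. It does not use the Kubeflow SDK at all. It only assumes, as described above, that the function's source is copied on its own to another location and executed there. The file name entrypoint.py, the helper run_script, and the two temporary directories standing in for the local workspace and the remote container are all hypothetical.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of the training function, standing in for what gets copied to the cluster.
TRAIN_FUNC_SOURCE = textwrap.dedent(
    """
    def train_func():
        from my_model import train  # local module, not a pip package
        train()

    train_func()
    """
)


def run_script(directory: Path) -> subprocess.CompletedProcess:
    """Write the function source into `directory` and run it from there."""
    script = directory / "entrypoint.py"  # hypothetical file name
    script.write_text(TRAIN_FUNC_SOURCE)
    return subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        cwd=directory,
    )


with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as remote:
    # "Local" workspace: my_model.py sits next to the script, so the import resolves.
    (Path(local) / "my_model.py").write_text("def train():\n    print('training')\n")
    local_result = run_script(Path(local))

    # "Remote" container: only the function source was copied, my_model.py is absent.
    remote_result = run_script(Path(remote))

print("local exit code:", local_result.returncode)
print("remote exit code:", remote_result.returncode)
print(remote_result.stderr.strip().splitlines()[-1])
```

The first run succeeds because my_model.py is on the import path next to the script. The second fails at the import, which matches the behaviour I see when the job runs on the cluster.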
Is there any way to include multiple files in TrainingClient().create_job() or TrainingClient().train() without using YAML configs and kubectl, and without adding the files to my Docker image?
Why is this needed?
It avoids having to rebuild the Docker image or create YAML configs each time you want to run a new training job.
Love this feature?
Give it a 👍. We prioritize the features with the most 👍.