Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add taskmaster env in FOCA config #203

Closed
wants to merge 2 commits into from
Closed

Conversation

JaeAeich
Copy link

@JaeAeich JaeAeich commented Aug 9, 2024

Summary by Sourcery

Add support for Taskmaster environment properties in the FOCA configuration, including new functions to retrieve custom configuration and Taskmaster template.

New Features:

  • Introduce taskmaster_env_properties in the custom configuration model to manage Taskmaster environment properties.

Enhancements:

  • Add functions to retrieve custom configuration and Taskmaster template from the configuration file.

@JaeAeich JaeAeich requested review from lvarin and uniqueg and removed request for lvarin August 9, 2024 07:15
Copy link

sourcery-ai bot commented Aug 9, 2024

Reviewer's Guide by Sourcery

This pull request adds taskmaster environment properties to the FOCA configuration. It includes new configuration models in tesk/custom_config.py, functions to fetch and utilize these properties in tesk/utils.py, and updates to the configuration file deployment/config.yaml to include the new taskmaster environment settings.

File-Level Changes

Files Changes
tesk/utils.py
tesk/custom_config.py
deployment/config.yaml
Introduced taskmaster environment properties in the FOCA configuration, including new configuration models and functions to fetch and utilize these properties.

Tips
  • Trigger a new Sourcery review by commenting @sourcery-ai review on the pull request.
  • Continue your discussion with Sourcery by replying directly to review comments.
  • You can change your review settings at any time by accessing your dashboard:
    • Enable or disable the Sourcery-generated pull request summary or reviewer's guide;
    • Change the review language;
  • You can always contact us if you have any questions or feedback.

@JaeAeich
Copy link
Author

JaeAeich commented Aug 9, 2024

wrt #200 I have made changes to some files and structure, specifically previously I was taking in the template from the config as well which is unnecessary, so changed that a bit.

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @JaeAeich - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 2 issues found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.

) from None


def get_taskmaster_template() -> V1Job:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Consider refactoring the get_taskmaster_template() function for improved readability

The function is quite long and deals with complex Kubernetes objects. Consider extracting the container configuration into a separate function to improve maintainability.

def get_taskmaster_template() -> V1Job:
    return _create_taskmaster_job()

def _create_taskmaster_job() -> V1Job:
    # Implementation details here
    pass

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree with the general thrust of this comment (although not necessarily with the suggestion for improvement).

Isn't there a way to avoid the hardcoding of all these values? Couldn't you define the models such that you can simply instantiate the outermost model and all those defaults are set?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your input, @uniqueg. You raise a valid point about avoiding hardcoding values. One approach to achieve this could be to use configuration files or environment variables to set these defaults, which can then be loaded into the models. This would make the code more flexible and easier to maintain.

For example, you could define a configuration file (e.g., taskmaster_config.yaml) with all the necessary default values and then load this configuration when creating the V1Job object. This way, you can easily update the configuration without changing the code.

Here's a rough outline of how you might implement this:

  1. Define the configuration file:
# taskmaster_config.yaml
job:
  apiVersion: "batch/v1"
  kind: "Job"
  metadata:
    name: "taskmaster-job"
  spec:
    template:
      spec:
        containers:
          - name: "taskmaster-container"
            image: "taskmaster-image"
            # Add other default values here
  1. Load the configuration in your function:
import yaml
from kubernetes.client import V1Job

def load_taskmaster_config() -> dict:
    with open("path/to/taskmaster_config.yaml", "r") as file:
        return yaml.safe_load(file)

def get_taskmaster_template() -> V1Job:
    config = load_taskmaster_config()
    return V1Job(**config)

This approach separates the configuration from the code, making it easier to manage and update. Let me know if this aligns with your thoughts or if you have any other suggestions!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, something like this...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad to hear that the suggestion aligns with your thoughts, @uniqueg. If you decide to go with this approach, it will indeed make the code more maintainable and flexible.

To further assist, here are a few additional tips:

  1. Validation: Ensure that the configuration file is validated against a schema to catch any errors early. You can use libraries like pydantic or cerberus for this purpose.

  2. Environment Variables: For sensitive information or values that might change between environments (e.g., development, staging, production), consider using environment variables. You can use a library like python-dotenv to load these variables.

  3. Documentation: Document the configuration file format and the expected environment variables. This will help other developers understand how to configure the system.

  4. Fallbacks: Implement sensible defaults in your code as a fallback in case the configuration file or environment variables are missing or incomplete.

Here's a more refined example incorporating these tips:

import os
import yaml
from kubernetes.client import V1Job
from pydantic import BaseModel, ValidationError

class TaskmasterConfig(BaseModel):
    apiVersion: str
    kind: str
    metadata: dict
    spec: dict

def load_taskmaster_config() -> dict:
    config_path = os.getenv("TASKMASTER_CONFIG_PATH", "path/to/taskmaster_config.yaml")
    with open(config_path, "r") as file:
        return yaml.safe_load(file)

def get_taskmaster_template() -> V1Job:
    try:
        config_dict = load_taskmaster_config()
        config = TaskmasterConfig(**config_dict)
        return V1Job(**config.dict())
    except (FileNotFoundError, ValidationError) as e:
        raise ConfigNotFoundError(f"Error loading Taskmaster configuration: {e}")

# Example usage
if __name__ == "__main__":
    job_template = get_taskmaster_template()
    print(job_template)

This example uses pydantic for validation and includes a fallback for the configuration file path using an environment variable. Let me know if you need any further assistance or have additional questions!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the clarification, @JaeAeich. Your points are well taken.

Given that the template is meant to be changed during runtime without external changes, it makes sense to avoid hardcoding values directly in the configuration file. However, we can still aim to improve the readability and maintainability of the get_taskmaster_template() function.

One approach could be to modularize the function by breaking it down into smaller helper functions. This way, you can keep the flexibility of runtime changes while improving the code structure. Here's an example:

def get_taskmaster_template() -> V1Job:
    return V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata=_get_metadata(),
        spec=_get_spec()
    )

def _get_metadata() -> dict:
    return {
        "name": "taskmaster-job",
        # Add other metadata fields here
    }

def _get_spec() -> dict:
    return {
        "template": {
            "spec": {
                "containers": [_get_container()],
                # Add other spec fields here
            }
        }
    }

def _get_container() -> dict:
    return {
        "name": "taskmaster-container",
        "image": "taskmaster-image",
        # Add other container fields here
    }

This approach keeps the configuration within the code but organizes it into smaller, more manageable pieces. It also allows for easy updates and modifications to individual parts of the template without affecting the entire function.

Let me know if this approach aligns with your requirements or if you have any other suggestions!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, but IMO this is not ideal for this case as we will never need other func even if we did people would usually prefer to change the config like its usually written in manifest.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback, @JaeAeich. I understand your concern about the practicality of breaking down the function into smaller parts, especially if the configuration is typically managed in a manifest-like format.

Given this context, it might be best to keep the function as a single unit but ensure it is well-documented and clearly structured for readability. Here’s a refined approach that maintains the function's integrity while improving readability:

def get_taskmaster_template() -> V1Job:
    """Create and return a Kubernetes Job template for Taskmaster.

    Returns:
        V1Job: A Kubernetes Job object configured for Taskmaster.
    """
    return V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata={
            "name": "taskmaster-job",
            # Add other metadata fields here
        },
        spec={
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "taskmaster-container",
                            "image": "taskmaster-image",
                            # Add other container fields here
                        }
                    ],
                    # Add other spec fields here
                }
            }
        }
    )

This approach keeps the function concise and self-contained, while the docstring provides clarity on its purpose and return value. Additionally, you can add inline comments to guide future developers on where to make changes.

Let me know if this strikes a balance between maintainability and practicality, or if you have any other suggestions!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit difficult to reason about this because I don't know how this is used later on. What will you do with that template? It looks like this is set up to make sure that the returned template is a valid job that can be executed (under certain circumstances, e.g., if an FTP instance is available). But if that's the case, it seems to me that this is something that should rather be set and used in tests (e.g., via a fixture).

If it's used to get a job template and then replace the values according to the actual needs (because it's easier to change values on an existing template), I don't understand why, e.g., env needs to be populated at this point (if it's overridden anyway). In other words, we then don't need to make sure the template is valid.

Moreover, if you really want to make it easier to set job params according to needs, why not create a class with setter methods for anything that may possibly be set by the job and user needs.

But again, a description and example of the use case of that function would help to give more specific feedback.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your insightful questions, @uniqueg. Understanding the use case and context of the get_taskmaster_template() function is indeed crucial for providing more specific feedback.

To clarify, the get_taskmaster_template() function is designed to generate a Kubernetes Job template that can be customized and executed based on runtime requirements. The primary goal is to provide a base template that can be easily modified to fit specific needs without having to redefine the entire structure each time.

Here are a few points to address your concerns:

  1. Use Case: The function is intended to provide a starting point for creating Kubernetes Jobs. It ensures that the basic structure is in place, and developers can then modify specific fields as needed. This approach helps maintain consistency and reduces the risk of errors when defining job templates.

  2. Template Modification: While the function provides a complete and valid job template, it is expected that certain fields (like env) will be overridden based on the specific requirements of the job. The idea is to have a flexible yet consistent base that can be adapted as needed.

  3. Class with Setter Methods: Creating a class with setter methods is a great idea for managing job parameters. This would provide a more structured and object-oriented approach to modifying the job template. Here's a rough example of how this could be implemented:

    from kubernetes.client import V1Job
    
    class TaskmasterJobTemplate:
        def __init__(self):
            self.job = V1Job(
                api_version="batch/v1",
                kind="Job",
                metadata={
                    "name": "taskmaster-job",
                    # Add other metadata fields here
                },
                spec={
                    "template": {
                        "spec": {
                            "containers": [
                                {
                                    "name": "taskmaster-container",
                                    "image": "taskmaster-image",
                                    # Add other container fields here
                                }
                            ],
                            # Add other spec fields here
                        }
                    }
                }
            )
    
        def set_container_image(self, image: str):
            self.job.spec['template']['spec']['containers'][0]['image'] = image
    
        def set_env_vars(self, env_vars: list):
            self.job.spec['template']['spec']['containers'][0]['env'] = env_vars
    
        def get_job(self) -> V1Job:
            return self.job
    
    # Example usage
    if __name__ == "__main__":
        job_template = TaskmasterJobTemplate()
        job_template.set_container_image("new-taskmaster-image")
        job_template.set_env_vars([{"name": "ENV_VAR", "value": "value"}])
        job = job_template.get_job()
        print(job)

This class-based approach allows for more granular control over the job template and makes it easier to update specific fields without modifying the entire structure.

  1. Testing: If the primary use of this function is for testing purposes, it might be beneficial to create fixtures or mock objects that can be used in tests. This would ensure that the job template is valid and can be easily modified for different test scenarios.

I hope this provides a clearer picture of the function's purpose and potential improvements. Let me know if you have any further questions or if there's anything else you'd like to discuss!

from tesk.constants import TeskConstants


class FtpConfig(BaseModel):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider flattening the configuration structure to reduce complexity.

The new code introduces additional complexity with three new classes (FtpConfig, ExecutorSecret, and TaskmasterEnvProperties). This increases the cognitive load and makes the configuration harder to understand at a glance. Consider flattening the structure to reduce the number of classes and simplify the configuration. Here’s a simplified version:

from typing import Dict, Optional
from pydantic import BaseModel, Field
from tesk.api.ga4gh.tes.models import Service
from tesk.constants import TeskConstants

class CustomConfig(BaseModel):
    """Custom configuration model for the FOCA app."""

    service_info: Service
    ftp_secret_name: Optional[str] = Field(
        default=None, description="Name of the secret with FTP account credentials"
    )
    ftp_enabled: bool = Field(
        default=False,
        description="If FTP account enabled (based on non-emptiness of secretName)",
    )
    executor_secret_name: Optional[str] = Field(
        default=None,
        description="Name of a secret that will be mounted as volume to each executor. The same name will be used for the secret and the volume",
    )
    executor_secret_mount_path: Optional[str] = Field(
        default=None,
        description="The path where the secret will be mounted to executors",
    )
    executor_secret_enabled: bool = Field(
        default=False, description="Indicates whether the secret is enabled"
    )
    taskmaster_image_name: str = Field(
        default=TeskConstants.taskmaster_image_name,
        description="Taskmaster image name",
    )
    taskmaster_image_version: str = Field(
        default=TeskConstants.taskmaster_image_version,
        description="Taskmaster image version",
    )
    filer_image_name: str = Field(
        default=TeskConstants.filer_image_name, description="Filer image name"
    )
    filer_image_version: str = Field(
        default=TeskConstants.filer_image_version, description="Filer image version"
    )
    debug: bool = Field(
        default=False,
        description="If verbose (debug) mode of taskmaster is on (passes additional flag to taskmaster and sets image pull policy to Always)",
    )
    environment: Optional[Dict[str, str]] = Field(
        default=None,
        description="Environment variables, that will be passed to taskmaster",
    )
    service_account_name: str = Field(
        default="default", description="Service Account name for taskmaster"
    )

This approach maintains the necessary configuration options while reducing the overall complexity.

@@ -3,6 +3,27 @@
import os
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider simplifying the new code to reduce complexity.

The new code introduces several complexities that could be simplified. Here are some suggestions:

  1. Imports: The increased number of imports adds cognitive load. Consider reducing them to only what's necessary.
  2. Functions: While modularity is good, the additional functions (get_custom_config, get_taskmaster_template, get_taskmaster_env_property) introduce new concepts and dependencies. Simplifying these could make the code easier to understand.
  3. Error Handling: Custom error handling (ConfigNotFoundError) adds another layer of complexity. Simplifying error handling can make the code more maintainable.
  4. Configuration Management: The custom configuration management system (Foca, CustomConfig) adds abstraction layers. Simplifying configuration retrieval can help.
  5. Kubernetes API Objects: Constructing Kubernetes API objects directly in the code is complex. Simplifying this part can make the code more readable.

Here's a simplified version of the code that focuses on core functionality and minimizes new concepts:

import os
from pathlib import Path
from kubernetes.client.models import V1Job, V1JobSpec, V1ObjectMeta, V1PodSpec, V1PodTemplateSpec, V1Container, V1EnvVar, V1EnvVarSource, V1SecretKeySelector, V1VolumeMount

def get_config_path() -> Path:
    config_path_env = os.getenv("TESK_FOCA_CONFIG_PATH")
    if config_path_env:
        return Path(config_path_env).resolve()
    else:
        return (Path(__file__).parents[1] / "deployment" / "config.yaml").resolve()

def get_taskmaster_template() -> V1Job:
    return V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata=V1ObjectMeta(name="taskmaster", labels={"app": "taskmaster"}),
        spec=V1JobSpec(
            template=V1PodTemplateSpec(
                metadata=V1ObjectMeta(name="taskmaster"),
                spec=V1PodSpec(
                    service_account_name="default",
                    containers=[
                        V1Container(
                            name="taskmaster",
                            image="docker.io/elixircloud/tesk-core-taskmaster:v0.10.2",
                            args=["-f", "/jsoninput/taskmaster_input.gz"],
                            env=[
                                V1EnvVar(
                                    name="FTP_SECRET_USERNAME",
                                    value_from=V1EnvVarSource(
                                        secret_key_ref=V1SecretKeySelector(
                                            name="ftp-secret",
                                            key="username",
                                            optional=True,
                                        )
                                    ),
                                ),
                                V1EnvVar(
                                    name="FTP_SECRET_PASSWORD",
                                    value_from=V1EnvVarSource(
                                        secret_key_ref=V1SecretKeySelector(
                                            name="ftp-secret",
                                            key="password",
                                            optional=True,
                                        )
                                    ),
                                ),
                            ],
                            volume_mounts=[
                                V1VolumeMount(
                                    name="podinfo",
                                    mount_path="/podinfo",
                                    read_only=True,
                                ),
                                V1VolumeMount(
                                    name="jsoninput",
                                    mount_path="/jsoninput",
                                    read_only=True,
                                ),
                            ],
                        )
                    ],
                    volumes=[],
                    restart_policy="Never",
                ),
            )
        ),
    )

This version reduces complexity and focuses on the core functionality, making it easier to read and maintain.

Returns:
The taskmaster template.
"""
job = V1Job(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)

tesk/utils.py Outdated Show resolved Hide resolved
) from None


def get_taskmaster_template() -> V1Job:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, something like this...

tesk/utils.py Show resolved Hide resolved
@JaeAeich
Copy link
Author

JaeAeich commented Aug 9, 2024

@uniqueg what I think I should rather do is pu the get_taskmaster_template in tesk/api/ga4gh/kubernetes/utils.py than tesk/utils.py

or better yet in tesk/api/ga4gh/kubernetes/constants.py !

Copy link
Member

@uniqueg uniqueg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a deeper look now that I'm on my laptop. I'm not sure about this PR. It seems to me that there is work to do on the config. I don't see why would have multiple places where things can be configured and where defaults are defined. Plus there is not a strict distinction between constants and config params.

And then there's the get_taskmaster_template() utilitiy function that I'm not sure how it's going to be used.

Perhaps it might be better to only deal with cleaning up and streamlining config management and constants (tesk.custom_config, tesk.constants.TeskConstants, tesk.api.kubernetes.constants.Constants, tesk.api.kubernetes.constants.K8sConstants could be merged into just tesk.constants and tesk.custom_config, IMO, with all of the config params being made accessible for modification via deployment/config.yaml to have everything in one place).

Then add the utility functions in future PRs as and when they are actually used by the application code - it's otherwise impossible to properly review them.

Comment on lines 120 to +121
storeLogs:
execution_trace: True
execution_trace: True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to custom config and document via schema? Or alternatively, this may be something we could add to FOCA's logging config.

Comment on lines +43 to +75
class TaskmasterEnvProperties(BaseModel):
"""Taskmaster environment properties model for the TESK."""

imageName: str = Field(
default=TeskConstants.taskmaster_image_name,
description="Taskmaster image name",
)
imageVersion: str = Field(
default=TeskConstants.taskmaster_image_version,
description="Taskmaster image version",
)
filerImageName: str = Field(
default=TeskConstants.filer_image_name, description="Filer image name"
)
filerImageVersion: str = Field(
default=TeskConstants.filer_image_version, description="Filer image version"
)
ftp: FtpConfig = Field(default=None, description="Test FTP account settings")
debug: bool = Field(
default=False,
description="If verbose (debug) mode of taskmaster is on (passes additional "
"flag to taskmaster and sets image pull policy to Always)",
)
environment: Optional[Dict[str, str]] = Field(
default=None,
description="Environment variables, that will be passed to taskmaster",
)
serviceAccountName: str = Field(
default="default", description="Service Account name for taskmaster"
)
executorSecret: Optional[ExecutorSecret] = Field(
default=None, description="Executor secret configuration"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be structured better?

We could have an object images that holds taskmaster and filer, and each of these are instances of the same model Image with properties name and version.

Similarly, we could have an object secrets that holds executor, storage (with ftp, s3 and so on).

filerImageVersion: str = Field(
default=TeskConstants.filer_image_version, description="Filer image version"
)
ftp: FtpConfig = Field(default=None, description="Test FTP account settings")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we want to read from (or write to) multiple FTP storage instances?

Comment on lines +73 to +75
executorSecret: Optional[ExecutorSecret] = Field(
default=None, description="Executor secret configuration"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be cases where it would make sense to support multiple executor secrets?

)


class TaskmasterEnvProperties(BaseModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not simply name this Taskmaster?

Comment on lines +17 to +21
enabled: bool = Field(
default=False,
description="If FTP account enabled (based on non-emptiness of secretName)",
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this

based on non-emptiness of secretName

Does this imply that this is automatically set based on the value of secretName? If so, why do we need it?

More generally, why do we need a mechanism to disable/enable a secret when we could just remove the secret? Goes for the executorSecret, too.

from tesk.constants import TeskConstants


class FtpConfig(BaseModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The properties of this model don't appear to me to be specific to FTP (only the descriptions). Could this then be abstracted to just Secret maybe?


from tesk.api.ga4gh.tes.models import Service
from tesk.constants import TeskConstants
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they are defaults, not constants, right? Because the user can change them via the config. So perhaps better to rename the module defaults and the

Also, if they were constants, note that it is Python convention to use ALL_CAPS names.

Finally, I'm still not sure why we don't keep the defaults where we define the models, because that's where we also keep the documentation of the corresponding params.

V1VolumeMount,
)

from tesk.api.kubernetes.constants import Constants, K8sConstants
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much the same comment applies to these as to TeskConstants. Not all of these appear to me to be constants, but might be changed by the user/admin, in which case I believe that they are config params and should be called as such.

Some of them might indeed be constants (TESK would fail if they were ever changed). In that case, the var names should be in ALL_CAPS.

Possibly it would make sense to define:

  1. A class Constants with ALL_CAPS attributes that should never change. Alternatively, perhaps better/more idiomatic, they could also be globals in a module constants.
  2. A class K8sConfig that is defined in CustomConfig and can be configured via deployment/config.yaml, perhaps in a section custom.kubernetes (just like all the other configurable stuff).
  3. One or more classes (e.g., JobConfig) to define the remaining config params. Also defined in CustomConfig and configurable by users/admins through sections in custom in deployment/config.yaml.

) from None


def get_taskmaster_template() -> V1Job:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit difficult to reason about this because I don't know how this is used later on. What will you do with that template? It looks like this is set up to make sure that the returned template is a valid job that can be executed (under certain circumstances, e.g., if an FTP instance is available). But if that's the case, it seems to me that this is something that should rather be set and used in tests (e.g., via a fixture).

If it's used to get a job template and then replace the values according to the actual needs (because it's easier to change values on an existing template), I don't understand why, e.g., env needs to be populated at this point (if it's overridden anyway). In other words, we then don't need to make sure the template is valid.

Moreover, if you really want to make it easier to set job params according to needs, why not create a class with setter methods for anything that may possibly be set by the job and user needs.

But again, a description and example of the use case of that function would help to give more specific feedback.

@JaeAeich
Copy link
Author

Had a deeper look now that I'm on my laptop. I'm not sure about this PR.

Yeah now that I have move context too I see where I can improve, ill create another PR keeping all the reviews from here. Not closing for that reason.

@JaeAeich
Copy link
Author

JaeAeich commented Sep 6, 2024

closing in favor of #206

@JaeAeich JaeAeich closed this Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants