
Need to impose size/cardinality limits on more fields #1655

Open
krancour opened this issue Oct 5, 2021 · 2 comments

@krancour (Contributor) commented Oct 5, 2021

Some fields in projects, events, and jobs have unbounded length (or, in the case of maps/arrays, cardinality).

We should impose some reasonable (TBD) limits -- or at the very least return helpful error messages when we bump into hard limits on the size of the underlying k8s resources.

Rationale:

Workers discover relevant event details from a chunk of JSON that is written to a k8s Secret and mounted into the worker's file system. Those details include event metadata (source, type, qualifiers, labels, etc.), embedded scripts/config (if applicable), the project name, project secrets (if applicable), the event payload (if applicable), and a token for communicating with the Brigade API server.

Jobs also each get their own secret that contains any/all environment variables used by the job.

Secrets have a hard size limit of 1MiB.
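
For a concrete sense of what a proactive guard could look like, here's a minimal sketch that checks a worker's serialized configuration against that 1MiB limit before the Secret is ever created. The workerConfig struct, its fields, and the helper name are hypothetical, not Brigade's actual types:

```go
package validation

import (
	"encoding/json"
	"fmt"
)

// maxSecretBytes mirrors the hard 1MiB limit Kubernetes imposes on Secret data.
const maxSecretBytes = 1 << 20

// workerConfig is a hypothetical stand-in for the chunk of JSON Brigade writes
// to a worker's Secret (event metadata, embedded scripts/config, payload, token, etc.).
type workerConfig struct {
	Payload            string            `json:"payload,omitempty"`
	DefaultConfigFiles map[string]string `json:"defaultConfigFiles,omitempty"`
	Token              string            `json:"token,omitempty"`
}

// checkWorkerConfigSize returns a descriptive error if the serialized config
// would not fit within a single Kubernetes Secret.
func checkWorkerConfigSize(cfg workerConfig) error {
	b, err := json.Marshal(cfg)
	if err != nil {
		return err
	}
	if len(b) > maxSecretBytes {
		return fmt.Errorf(
			"worker configuration is %d bytes, but it must fit within the %d-byte "+
				"limit Kubernetes imposes on Secrets; consider trimming the event "+
				"payload or embedded config files",
			len(b), maxSecretBytes,
		)
	}
	return nil
}
```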

Some fields whose size we might consider limiting:

  • workerTemplate.defaultConfigFiles on a project definition. Enforce a max number of files, each with a max size? Or enforce a cumulative max size?
  • qualifiers on an event-- enforce a max number of qualifiers?
  • labels on an event-- enforce a max number of labels?
  • payload on an event
  • env vars on a job-- enforce a max number of env vars, each with a max size? Or enforce a cumulative max size?

I'm sure there are many more.
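
For illustration only, a rough sketch of what hard per-field limits on an event might look like; the limit values and the pared-down event struct are made up for the example, not proposed defaults:

```go
package validation

import "fmt"

// Illustrative limits only -- actual values are TBD.
const (
	maxQualifiers   = 20
	maxLabels       = 20
	maxPayloadBytes = 128 * 1024 // 128KiB
)

// event is a hypothetical, pared-down stand-in for a Brigade event.
type event struct {
	Qualifiers map[string]string
	Labels     map[string]string
	Payload    string
}

// validateEvent enforces the illustrative limits above.
func validateEvent(e event) error {
	if len(e.Qualifiers) > maxQualifiers {
		return fmt.Errorf("event has %d qualifiers; the max is %d", len(e.Qualifiers), maxQualifiers)
	}
	if len(e.Labels) > maxLabels {
		return fmt.Errorf("event has %d labels; the max is %d", len(e.Labels), maxLabels)
	}
	if len(e.Payload) > maxPayloadBytes {
		return fmt.Errorf("event payload is %d bytes; the max is %d", len(e.Payload), maxPayloadBytes)
	}
	return nil
}
```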

I think this could get complicated quickly. Figuring out reasonable limits on map/array cardinality or limits on individual field lengths so they add up to a certain max size sounds tricky. And if we work it all out and don't leave room for expansion (new fields) someday? Then what?

There are also a lot of permutations here. Should someone be able to embed a larger script if their payloads are always small? Larger payloads if their scripts are small?

So, instead of putting hard limits on the cardinality of different data structures or the length of individual fields, maybe we should consider simply returning appropriate error codes and detailed messages when Brigade itself bumps into the limits imposed by k8s?
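
A minimal sketch of that alternative, assuming a hypothetical createWorkerSecret helper (not Brigade's actual code): rather than pre-validating every field, let the API server enforce its own limits and attach actionable context to the failure before surfacing it:

```go
package brigade

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createWorkerSecret is a hypothetical helper. It defers size enforcement to
// the Kubernetes API server and wraps any rejection with a more useful message.
func createWorkerSecret(
	ctx context.Context,
	client kubernetes.Interface,
	namespace string,
	secret *corev1.Secret,
) error {
	if _, err := client.CoreV1().Secrets(namespace).Create(
		ctx, secret, metav1.CreateOptions{},
	); err != nil {
		// Kubernetes rejects Secrets whose data exceeds 1MiB; large payloads,
		// embedded config files, or many env vars can all push past that limit.
		return fmt.Errorf(
			"error creating secret %q for worker configuration (note that "+
				"Kubernetes rejects Secrets larger than 1MiB): %w",
			secret.Name, err,
		)
	}
	return nil
}
```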

@krancour krancour added this to the 2.0.0-beta.4 milestone Oct 5, 2021
@krancour krancour self-assigned this Oct 5, 2021
@krancour krancour changed the title Need to impose size limits on more fields Need to impose size/cardinality limits on more fields Oct 5, 2021
@vdice (Contributor) commented Oct 6, 2021

> So, instead of putting hard limits on the cardinality of different data structures or the length of individual fields, maybe we should consider simply returning appropriate error codes and detailed messages when Brigade itself bumps into the limits imposed by k8s?

I'm leaning towards this option for the near future.

@krancour (Contributor, Author) commented Nov 8, 2021

@vdice and I have discussed this and decided not to let this issue hold up beta.4 / rc.1 / GA.

Due to the asynchronous nature of Brigade, and a deliberate effort not to let secrets pile up in k8s (which has a deleterious effect on cluster performance), worker-specific and job-specific k8s secrets are created just-in-time, i.e. right before the corresponding pod is created and launched on the cluster.

Again, due to the asynchronous nature of things, no user (or service account) is there to "see" the resulting error. As things stand currently, the worker or job should fail, but it will fail with very little information about why.

The resolution we've agreed on is to introduce a PhaseReason field to both WorkerStatus and JobStatus types. This will allow Brigade components to communicate additional context about why a worker or job has ended up in a particular state.
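
For context, a rough sketch of what that might look like; everything here other than the PhaseReason idea itself (type names, phase types, JSON tags) is illustrative rather than Brigade's actual definitions:

```go
package brigade

// WorkerPhase and JobPhase stand in for the existing phase types.
type WorkerPhase string
type JobPhase string

// WorkerStatus gains a PhaseReason that components can populate with
// human-readable context, e.g. "worker configuration exceeded the 1MiB
// limit Kubernetes imposes on Secrets".
type WorkerStatus struct {
	Phase       WorkerPhase `json:"phase,omitempty"`
	PhaseReason string      `json:"phaseReason,omitempty"`
}

// JobStatus gains the same field for job-level failures, e.g. a job whose
// environment variables could not fit into its Secret.
type JobStatus struct {
	Phase       JobPhase `json:"phase,omitempty"`
	PhaseReason string   `json:"phaseReason,omitempty"`
}
```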

We feel this is a large change that we don't want to undertake at the 11th hour, but we also feel it is not a breaking change and can therefore reasonably be deferred until a 2.x release -- tentatively 2.1.

@krancour krancour added 2.1 and removed 2.0 labels Nov 8, 2021
@krancour krancour removed this from the 2.0.0-beta.4 milestone Nov 8, 2021
@krancour krancour removed the 2.1 label Dec 2, 2021