Use Hosted Control Planes for ROSA to speed up cluster creation #748

Merged 2 commits on Apr 2, 2024

Contributor

@ryanemerson ryanemerson commented Mar 26, 2024

I have removed the multi-AZ cluster options for creating ROSA clusters, as we haven't been using them and have no need for this functionality, at least in the short term. If there are any objections, I can attempt to reinstate them.

Resolves #673
Resolves #750

Remaining actions:

  • Test GH action integration for creating clusters
  • Provision crossdc setup across multiple HCP clusters

@ryanemerson ryanemerson force-pushed the opentofu_hcp branch 9 times, most recently from 375545c to 31f246e Compare March 27, 2024 15:04
@ryanemerson ryanemerson marked this pull request as ready for review March 27, 2024 15:36
@ryanemerson
Contributor Author

Multi-az clusters created and deployed as expected on inspection: https://github.com/ryanemerson/keycloak-benchmark/actions/runs/8452406289

@ahus1 ahus1 requested review from ahus1 and mhajas March 27, 2024 17:01
.gitignore (outdated review thread, resolved)
provision/aws/rosa_create_cluster.sh (outdated review thread, resolved)
Comment on lines +69 to +82
variable "openshift_version" {
  type    = string
  default = "4.14.5"
}

variable "instance_type" {
  type    = string
  default = "m5.4xlarge"
}

variable "replicas" {
  type    = number
  default = 2
}
Contributor

Nitpick: as we now have a good place for the defaults here, I would prefer not to keep them in other places and to maintain them only here.

That way we would have fewer parameters in our workflows. It would also be OK to remove some and keep others if you think we will use them now and then.

As this is a nitpick, feel free to leave it for a future PR.

Contributor Author

I've removed the default values from the GH rosa-cluster-create workflow, but left the inputs so that the defaults can still be overridden. The rosa_create_cluster.sh script then passes each variable to tofu as a -var argument only if a value is specified.
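
The override pass-through described here could look roughly like the following sketch. It is not the actual content of rosa_create_cluster.sh: the environment variable names (OPENSHIFT_VERSION, INSTANCE_TYPE, REPLICAS), the function name and the TOFU_BIN override are assumptions made for illustration, while the OpenTofu variable names are taken from the snippet under review.

```shell
# Sketch only: environment variable names and TOFU_BIN are hypothetical.
# Runs "tofu apply", forwarding only the overrides that were actually supplied,
# so that unset or empty inputs fall back to the defaults declared in the module.
tofu_apply_with_overrides() {
  set -- apply -auto-approve
  if [ -n "${OPENSHIFT_VERSION:-}" ]; then
    set -- "$@" -var "openshift_version=${OPENSHIFT_VERSION}"
  fi
  if [ -n "${INSTANCE_TYPE:-}" ]; then
    set -- "$@" -var "instance_type=${INSTANCE_TYPE}"
  fi
  if [ -n "${REPLICAS:-}" ]; then
    set -- "$@" -var "replicas=${REPLICAS}"
  fi
  # TOFU_BIN defaults to the real CLI; set it to "echo" for a dry run.
  "${TOFU_BIN:-tofu}" "$@"
}
```

With `TOFU_BIN=echo` and only `REPLICAS=3` set, the function prints the command line it would run instead of calling OpenTofu, which makes it easy to check that no `-var` is emitted for inputs that were left empty.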

provision/opentofu/modules/rosa/README.md (outdated review thread, resolved)
@ahus1
Contributor

ahus1 commented Mar 27, 2024

@ryanemerson - great job, happy to see the first working cluster created via OpenTofu!

@ryanemerson ryanemerson force-pushed the opentofu_hcp branch 2 times, most recently from 3b17e86 to ff2be5a Compare March 28, 2024 10:05
@ryanemerson
Contributor Author

I've addressed all comments and have executed a new GH run to verify: https://github.com/ryanemerson/keycloak-benchmark/actions/runs/8465601858

@ryanemerson
Contributor Author

The previous run uncovered an issue with the EFS script. All working as expected now:

https://github.com/ryanemerson/keycloak-benchmark/actions/runs/8466454176

Contributor

@ahus1 ahus1 left a comment

Thank you for the change, it is good to see it will shave off 2x20 min for building our ROSA clusters.

I'm approving it today, and will merge it on Monday or Tuesday, so we'll not run into trouble over the long weekend.

@ahus1 ahus1 self-assigned this Mar 28, 2024
Contributor

@mhajas mhajas left a comment

Nice work @ryanemerson! This is a huge step forward towards using OpenTofu! I expected creating ROSA clusters would be simpler in tofu though.

Anyway, I think this is ready for merging. I added two comments, but even if they require some changes, I am OK with making those later.

}
cd ${SCRIPT_DIR}/../opentofu/modules/rosa/hcp
tofu init
tofu workspace new ${CLUSTER_NAME} || true
Contributor

I am confused about why this is here. Is each cluster in a separate workspace? My expectation was that a new workspace would be created at the beginning and then all resources would be created in that workspace. For example, the daily run would have only one workspace. Did I miss something?

Contributor Author

I don't think it's possible for us to use a single workspace for two clusters, as we're effectively using the same module with different configurations to create the two clusters. I think that on executing apply a second time with a different cluster_name, OpenTofu would attempt to update the first cluster's resources.

This is all new to me as well though, so maybe I'm misunderstanding 🙂
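
A per-cluster workspace keeps a separate state for each cluster, so applying the same module with a different cluster_name does not touch the other cluster's resources. A minimal sketch of selecting such a workspace, creating it on first use, is shown below. The function name and the TOFU_BIN override are assumptions for illustration and are not part of the PR; `tofu workspace select` and `tofu workspace new` are the standard OpenTofu subcommands.

```shell
# Sketch only: function name and TOFU_BIN are hypothetical.
# Switch to the workspace named after the cluster. If selecting fails
# (typically because the workspace does not exist yet), create it, which
# also switches to it. The error from the failed select stays visible.
use_cluster_workspace() {
  cluster_name="$1"
  "${TOFU_BIN:-tofu}" workspace select "${cluster_name}" ||
    "${TOFU_BIN:-tofu}" workspace new "${cluster_name}"
}

# Example invocation (commented out so the sketch has no side effects):
#   use_cluster_workspace "${CLUSTER_NAME}"
#   tofu apply -auto-approve -var "cluster_name=${CLUSTER_NAME}"
```

Unlike a plain "create and ignore failures" approach, this variant guarantees that the named workspace is the active one on both the first and every later run.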

Contributor

It is new to me as well, but using the same module more than once should be possible.

For example, I used two modules here: https://github.com/mhajas/keycloak-benchmark/blob/opentofu-poc/provision/opentofu/main.tf#L7-L14

You can also specify aliases for providers and use multiple provider configurations within one .tf file, like this:
https://github.com/mhajas/keycloak-benchmark/blob/opentofu-poc/provision/opentofu/infinispan/main.tf#L7
and
https://github.com/mhajas/keycloak-benchmark/blob/opentofu-poc/provision/opentofu/infinispan/main.tf#L92

But as I said, it is ok for me to merge this as is and play with enhancements later.

Contributor Author

> For example, I used two modules here: https://github.com/mhajas/keycloak-benchmark/blob/opentofu-poc/provision/opentofu/main.tf#L7-L14

I hadn't considered referencing two modules from the root like that 🙂

I think the downside of that approach is that we would have to make more changes to our various *.sh scripts and GH actions, as we would need to either replace rosa_create_cluster.sh or adapt it to support creating multiple clusters at once.

The flip side is that it would definitely make it easier to create multiple clusters in parallel, as the first stage of the module could be to determine CIDR ranges for both clusters before provisioning them.
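
To make the idea of determining the CIDR ranges up front more concrete, here is a rough sketch that picks several unused ranges in one pass before any cluster is provisioned. It is not the logic of the existing CIDR script: the 10.0.N.0/24 candidate pattern, the function name and the space-separated input format are assumptions, and the comparison is by exact string match only, not by real overlap arithmetic.

```shell
# Sketch only: candidate pattern, function name and input format are hypothetical.
# Print the first N candidate CIDRs that are not already in use.
#   $1: number of ranges needed
#   $2: space-separated list of CIDRs already in use
# The body runs in a subshell so its variables do not leak into the caller.
pick_free_cidrs() (
  needed="$1"
  used="$2"
  i=0
  while [ "${needed}" -gt 0 ] && [ "${i}" -le 255 ]; do
    candidate="10.0.${i}.0/24"
    case " ${used} " in
      *" ${candidate} "*) ;;          # already taken, skip it
      *)
        echo "${candidate}"
        needed=$((needed - 1))
        ;;
    esac
    i=$((i + 1))
  done
  # Exit non-zero if fewer ranges than requested were free.
  [ "${needed}" -eq 0 ]
)
```

For example, `pick_free_cidrs 2 "10.0.0.0/24 10.0.2.0/24"` prints `10.0.1.0/24` and `10.0.3.0/24`, one per line. Because both ranges are chosen in a single call, two clusters created together cannot be handed the same range, which is the property the current check-only approach lacks.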

opentofu/remote-state/main.tf (outdated review thread, resolved)
- Remove multi-az cluster create options
- Remove need for ccoctl when provisioning EFS as the
  cloud-credential-operator is not installed on HCP clusters

Resolves keycloak#673

Signed-off-by: Ryan Emerson <[email protected]>
@mhajas
Contributor

mhajas commented Mar 28, 2024

I just realized one more thing. I probably know the answer as we are using the same script for choosing CIDR but I think it is worth double-checking. Can we run cluster creation in parallel with the new approach? https://github.com/keycloak/keycloak-benchmark/blob/main/.github/workflows/rosa-multi-az-cluster-create.yml#L61-L62

@ahus1 ahus1 merged commit 23965bf into keycloak:main Apr 2, 2024
3 checks passed
@ahus1
Contributor

ahus1 commented Apr 2, 2024

> Can we run cluster creation in parallel with the new approach?

Not yet, AFAIK, as we're still using the same script. Still, it is a worthy next step. At the moment the script only checks for an available IP range, but doesn't allocate it. Maybe we can find a mechanism to reserve the CIDR by setting some expiring information somewhere (maybe an S3 bucket entry with a timeout?). No idea what the most lightweight approach would be. cc: @ryanemerson

@ryanemerson
Contributor Author

ryanemerson commented Apr 2, 2024

I think the simplest way to allow parallel creation, i.e. without additional dependencies, would be to utilise a Terraform module that creates multiple clusters in a single workspace, as suggested by @mhajas in #748 (comment).
