Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(VPCSC): enable dryrun mode #1210

Merged

Conversation

eeaton
Copy link
Collaborator

@eeaton eeaton commented Apr 29, 2024

Address issue #1209 .
CI tests have high flaky failure rate due in part to propagation delays with VPCSC perimeter. Changing this to dry mode better adheres to best practices for enablement and will avoid this issue in CI tests.

@eeaton eeaton requested review from rjerrems, gtsorbo and a team as code owners April 29, 2024 09:56
@eeaton eeaton enabled auto-merge (squash) April 29, 2024 09:57
restricted_services = var.restricted_services
vpc_accessible_services = ["RESTRICTED-SERVICES"]
# configurations for a perimeter in dry run mode.
resources_dry_run = [var.project_number]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the way the projects are added to the perimeter in step 4-projects also needs to be updated.

  vpc_service_control_attach_enabled = var.vpc_service_control_attach_enabled
  vpc_service_control_perimeter_name = var.vpc_service_control_perimeter_name
  vpc_service_control_sleep_duration = var.vpc_service_control_sleep_duration

https://github.com/terraform-google-modules/terraform-example-foundation/blob/master/4-projects/modules/single_project/main.tf#L62C1-L64C78

This cloud be a new flag like how it is coded in the project factory module

https://github.com/terraform-google-modules/terraform-google-project-factory/blob/master/modules/core_project_factory/main.tf#L347C1-L365C2

Copy link
Contributor

@amandakarina amandakarina Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey Eliot, just wondering if this change could be done with a flag instead of comment/uncomment the code, export in outputs to be read by other steps, like 4-projects as Daniel mentioned. It would also be less tricky for others blueprints that have documentation of how to deploy on top of foundation.

Copy link
Contributor

@daniel-cit daniel-cit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • This fix also needs to be implemented in the hub and spoke flow
  • and also in the simple project module in step 4-projects
  • It would be helpful for the user to have a note on the main README and the READMEs from step 3 that the perimeter is created in dry mode and that it need to be updated to start enforcing the rules.

is it possible to have a flag vcp_sc_dry_run that could be set to true in the test instead of editing the code?

Copy link
Contributor

@fmichaelobrien fmichaelobrien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
either with/without dryrun flag - will be usefull

@eeaton
Copy link
Collaborator Author

eeaton commented Apr 30, 2024

Thanks Daniel & Amanda for the input.

ACK to add the changes consistently across step 4-projects, simple project module, and hub-and-spoke flow
Will add directions on the Readme about what to set for VPCSC.

re: the flag, like our discussion over chat I realize now we need separate input variables, because it is a valid use case for customers to set the enforced perimeter and dryrun perimeter to separate configurations (they are two different resources managed by the same CFT module). I'll draft this proposal and update the PR.

@eeaton
Copy link
Collaborator Author

eeaton commented May 2, 2024

Quick update: I'm blocked by a provider error on CFT.
terraform-google-modules/terraform-google-project-factory#904

The project factory module has partially designed support for an argument vpc_service_control_attach_dry_run but doesn't complete the implementation, it's not actually exposed to the Terraform Google Provider.

I've experimented locally with a few different patterns consistently that set the perimeter to dry_run mode in the 3-networks and 4-projects stage, but until the provider is fixed I don't think I can implement it using the current project factory module.

@daniel-cit
Copy link
Contributor

Quick update: I'm blocked by a provider error on CFT. terraform-google-modules/terraform-google-project-factory#904

The project factory module has partially designed support for an argument vpc_service_control_attach_dry_run but doesn't complete the implementation, it's not actually exposed to the Terraform Google Provider.

I've experimented locally with a few different patterns consistently that set the perimeter to dry_run mode in the 3-networks and 4-projects stage, but until the provider is fixed I don't think I can implement it using the current project factory module.

@eeaton new release for the project factory module v15.0.0

@eeaton
Copy link
Collaborator Author

eeaton commented May 3, 2024

I've made a few new commits, but no need to do a full review yet. I've got a working version of configuring the dry-run perimeter in my local environment for the dualsvpc pattern, but haven't tested it fully for hub-and-spoke (just copied off the key differences manually). I want to run the full CI tests to identify any configuration errors I might have introduced.

Once that is working well I'll add the readme changes and then ask for a review

@eeaton eeaton marked this pull request as draft May 24, 2024 15:01
auto-merge was automatically disabled May 24, 2024 15:01

Pull request was converted to draft

@daniel-cit
Copy link
Contributor

/gcbrun

eeaton added 2 commits June 28, 2024 08:07
… Result from ` gcloud beta access-context-manager supported-services list --format='get(NAME)' --filter="SUPPORT_STAGE=GA" `"

This reverts commit 600ce9e.
@eeaton eeaton marked this pull request as ready for review June 28, 2024 10:12
@eeaton eeaton requested a review from sleighton2022 as a code owner June 28, 2024 10:12
@daniel-cit
Copy link
Contributor

/gcbrun

@eeaton
Copy link
Collaborator Author

eeaton commented Jun 28, 2024

Looks like CI is repeatedly failing here, but I haven't been able to identify a more specific issue in the logs regarding what parts of the TestOrg specifically are failing:

Step #8 - "verify-org": FAIL
Step #8 - "verify-org": FAIL github.com/terraform-google-modules/terraform-example-foundation/test/integration/org 228.598s
Step #8 - "verify-org": FAIL

@daniel-cit
Copy link
Contributor

@eeaton
It looks like the verification is failing due to changes in the Terraform state.
The test framework checks (terraform plan) if there is any change in the Terraform state (Error 2) after the original apply.

Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: Note: Objects have changed outside of Terraform
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: 
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: Terraform detected the following changes made outside of Terraform since the
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: last "terraform apply" which may have affected this plan:
... 
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: Terraform used the selected providers to generate the following execution
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: plan. Resource actions are indicated with the following symbols:
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:   ~ update in-place
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: 
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: Terraform will perform the following actions:
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: 
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:   # module.cai_monitoring.module.cloud_function.google_cloudfunctions2_function.function will be updated in-place
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:   ~ resource "google_cloudfunctions2_function" "function" {
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:         id               = "projects/t7i-c-scc-ub3o/locations/us-central1/functions/caiMonitoring"
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:         name             = "caiMonitoring"
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:         # (11 unchanged attributes hidden)
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: 
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:       ~ service_config {
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:           ~ environment_variables            = {
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:               - "LOG_EXECUTION_ID" = "true" -> null <===
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:                 # (2 unchanged elements hidden)
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:             }
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:             # (14 unchanged attributes hidden)
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:         }
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: 
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:         # (2 unchanged blocks hidden)
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185:     }
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: 
Step #8 - "verify-org": TestOrg 2024-06-28T11:37:22Z command.go:185: Plan: 0 to add, 1 to change, 0 to destroy.
Step #8 - "verify-org":     terraform.go:504: 
Step #8 - "verify-org":         	Error Trace:	/builder/home/go/pkg/mod/github.com/!google!cloud!platform/cloud-foundation-toolkit/infra/[email protected]/pkg/tft/terraform.go:504
Step #8 - "verify-org":         	            				/workspace/test/integration/org/org_test.go:107
Step #8 - "verify-org":         	            				/builder/home/go/pkg/mod/github.com/!google!cloud!platform/cloud-foundation-toolkit/infra/[email protected]/pkg/tft/terraform.go:605
Step #8 - "verify-org":         	            				/builder/home/go/pkg/mod/github.com/!google!cloud!platform/cloud-foundation-toolkit/infra/[email protected]/pkg/tft/terraform.go:637
Step #8 - "verify-org":         	            				/builder/home/go/pkg/mod/github.com/!google!cloud!platform/cloud-foundation-toolkit/infra/[email protected]/pkg/utils/stages.go:31
Step #8 - "verify-org":         	            				/builder/home/go/pkg/mod/github.com/!google!cloud!platform/cloud-foundation-toolkit/infra/[email protected]/pkg/tft/terraform.go:637
Step #8 - "verify-org":         	Error:      	Should not be: 2
Step #8 - "verify-org":         	Test:       	TestOrg

the attribute with the changed value environment_variables.LOG_EXECUTION_ID is not declared in the foundation or in the Cloud Function module we are using, it looks like it was added by the API itself.

We may try to setup

environment_variables ={
"LOG_EXECUTION_ID" = "true"
}

to see if this will solve the issue.

I will do a local test to see if this will solve the problem.

@eeaton
Copy link
Collaborator Author

eeaton commented Jun 28, 2024

Got it, I've started seeing the same error on the other PR about fixing the SA, even though I don't expect that either PR changes things that should impact the TestOrg step, so that does suggest it's a change from the API itself that will break the CI on every branch.

Copy link
Contributor

@daniel-cit daniel-cit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions for fixing the two errors in the test

test/integration/networks/networks_test.go Outdated Show resolved Hide resolved
test/integration/networks/networks_test.go Outdated Show resolved Hide resolved
@eeaton
Copy link
Collaborator Author

eeaton commented Jul 1, 2024

The tests in networks_test.go passed, but now CI failed for a similar test in projects_tests.go. I've applied similar logic that we used in networks_test.go, to check project membership against the dry-run diff object.

@eeaton
Copy link
Collaborator Author

eeaton commented Jul 2, 2024

@daniel-cit Woo hoo, all tests are finally green! Thanks again for all the help with this PR.

When you have a chance can you review and approve please?

@daniel-cit
Copy link
Contributor

@daniel-cit Woo hoo, all tests are finally green! Thanks again for all the help with this PR.

When you have a chance can you review and approve please?

I will do a deploy to test the upgrade path
https://github.com/eeaton/terraform-example-foundation/blob/fix-1209-vpcsc-dryrun/3-networks-dual-svpc/README.md#optional-enforce-vpc-service-controls

Copy link
Contributor

@daniel-cit daniel-cit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eeaton
Copy link
Collaborator Author

eeaton commented Jul 3, 2024

@apeabody can I please get your formal review as a code owner to merge?
I've already done a detailed functional review with Daniel

Copy link
Contributor

@apeabody apeabody left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eeaton!

Approved by @daniel-cit

@apeabody apeabody enabled auto-merge (squash) July 3, 2024 15:03
@daniel-cit
Copy link
Contributor

/gcbrun

@apeabody apeabody merged commit 4365eab into terraform-google-modules:master Jul 4, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants