docs: Improve 4-projects readme and troubleshooting (terraform-google-modules#998)

Co-authored-by: Daniel Andrade <[email protected]>
Co-authored-by: Rohan Jerrems <[email protected]>
3 people authored Nov 15, 2023
1 parent 511f5cb commit 79455c9
Showing 2 changed files with 89 additions and 1 deletion.
2 changes: 2 additions & 0 deletions 4-projects/README.md
@@ -76,6 +76,8 @@ Other Workspaces can also be created to isolate deployments if needed.

**Note:** Make sure that you use version 1.3.0 or later of Terraform throughout this series. Otherwise, you might experience Terraform state snapshot lock errors.

**Note 2:** As mentioned in 0-bootstrap [README note 2](../0-bootstrap/README.md#deploying-with-cloud-build), at the end of the Cloud Build deploy section, make sure that you have requested at least 50 additional projects for the **projects step service account**. Otherwise, you may get a project quota exceeded error during the following steps, and you will need to apply the fix from [this entry](../docs/TROUBLESHOOTING.md#attempt-to-run-4-projects-step-without-enough-project-quota) of the Troubleshooting guide in order to continue.

### Troubleshooting

Please refer to [troubleshooting](../docs/TROUBLESHOOTING.md) if you run into issues during this step.
88 changes: 87 additions & 1 deletion docs/TROUBLESHOOTING.md
@@ -24,7 +24,7 @@ See [GLOSSARY.md](./GLOSSARY.md).
- [Cannot assign requested address error in Cloud Shell](#cannot-assign-requested-address-error-in-cloud-shell)
- [Error: Unsupported attribute](#error-unsupported-attribute)
- [Error: Error adding network peering](#error-error-adding-network-peering)

- [Error: Unknown project id on 4-project step context](#error-unknown-project-id-on-4-project-step-context)
- - -

### Project quota exceeded
@@ -248,6 +248,92 @@ In a deploy using the [Hub and Spoke](https://cloud.google.com/architecture/secu

This is a transient error and the deploy can be retried. Wait for at least a minute and retry the deploy.


### Error: Unknown project id on 4-project step context

**Error message:**

```text
Error 400: Unknown project id: 'prj-<business-unit>-<environment>-sample-base-<random-suffix>', invalid
```

**Cause:**

If you run the 4-projects step without first requesting additional project quota for the **project service account created in the 0-bootstrap step**, you may see the error above even after the project quota issue is resolved, because the failed run leaves the Terraform state inconsistent.

**Solution:**

- Make sure you [have requested the additional project quota](#project-quota-exceeded) for the **project SA e-mail** before running the following steps.

You will need to mark some Terraform resources as **tainted** to trigger the recreation of the missing projects and fix the inconsistency in the Terraform state.

1. In a terminal, navigate to the path where the error is being reported.

For example, if the unknown project ID is `prj-bu1-p-sample-base-abcd`, go to `./gcp-projects/business_unit_1/production` (`business_unit_1` because of `bu1` and `production` because of `p`; see the Security Foundations [naming conventions](https://cloud.google.com/architecture/security-foundations/using-example-terraform#naming_conventions) for more information on the project naming guidelines).

```bash
cd ./gcp-projects/<business_unit>/<environment>
```

1. Run the `terraform init` command so you can pull the remote state.

```bash
terraform init
```

1. Run the `terraform state list` command, filtering by `random_project_id_suffix`.
This command lists the resources for all the projects that use a random suffix and are expected to exist for this BU and environment.

```bash
terraform state list | grep random_project_id_suffix
```

1. Identify the folder that is the parent of the environment's projects.
If the Terraform Example Foundation is deployed directly under the organization, use `--organization`; if it is deployed under a folder, use `--folder`. `ORGANIZATION_ID` and `PARENT_FOLDER` are the input values provided for the 0-bootstrap step.

```bash
gcloud resource-manager folders list [ --organization=ORGANIZATION_ID ][ --folder=PARENT_FOLDER ]
```

1. The result of the `gcloud` command will look like the following output.
Using the `production` environment for this example, the folder ID for the environment would be `333333333333`.

```text
DISPLAY_NAME         PARENT_NAME            ID
fldr-bootstrap       folders/PARENT_FOLDER  111111111111
fldr-common          folders/PARENT_FOLDER  222222222222
fldr-production      folders/PARENT_FOLDER  333333333333
fldr-non-production  folders/PARENT_FOLDER  444444444444
fldr-development     folders/PARENT_FOLDER  555555555555
```

1. Run the `gcloud projects list` command to list all the projects that were actually created.
Replace `<id_of_the_environment_folder>` with the ID of the folder retrieved in the previous step.

```bash
gcloud projects list --filter="parent=<id_of_the_environment_folder>"
```

1. For each resource listed in the `terraform state list` step whose project is **not** returned by the `gcloud projects list` step, mark that resource as tainted. This forces the resource to be recreated and fixes the inconsistency in the Terraform state.

```bash
terraform taint <resource>[index]
```

For example, the following command marks the env secrets project as tainted. You may need to run the `terraform taint` command multiple times, depending on how many projects are missing.

```bash
terraform taint module.env.module.env_secrets_project.module.project.module.project-factory.random_string.random_project_id_suffix[0]
```

1. After running the `terraform taint` command for all the non-matching items, go to Cloud Build and retry the failed job.
The retry should complete successfully. If you encounter a similar error for another BU/environment, follow this guide again, changing the paths to match the BU/environment reported in the error log.
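
The comparison between the expected and the actual projects in the steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `find_missing_suffix_resources` and its input format are hypothetical, and it assumes that each project ID ends in `-<random-suffix>`, as in the error message shown earlier.

```bash
#!/usr/bin/env bash
# Sketch only: the function name and input format are hypothetical,
# not part of the Terraform Example Foundation.
# Assumption: each project ID ends in "-<random-suffix>".

# find_missing_suffix_resources PROJECT_IDS_FILE
#   stdin:            one "<resource address> <suffix>" pair per line
#   PROJECT_IDS_FILE: one existing project ID per line
#   stdout:           addresses whose suffix matches no existing project ID,
#                     that is, the candidates for `terraform taint`
find_missing_suffix_resources() {
  local project_ids_file="$1"
  local address suffix
  while read -r address suffix; do
    # Skip blank lines and entries that carry no suffix.
    [ -n "$address" ] && [ -n "$suffix" ] || continue
    # A project counts as existing if some project ID ends in "-<suffix>".
    if ! grep -q -e "-${suffix}\$" "$project_ids_file"; then
      printf '%s\n' "$address"
    fi
  done
}
```

To use it, save the project IDs returned by the `gcloud projects list` step to a file, one per line, and feed the helper one line per indexed resource from the `terraform state list` step together with that resource's suffix. One way to look up a suffix is to inspect the resource with `terraform state show <resource>` and read its `result` attribute. Review the printed addresses before passing any of them to `terraform taint`.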

**Notes:**

- Run the taint command only for the resources that have an index (`[number]`) at the end of the line returned by the `terraform state list` step. You don't need to run it for the groups (the resources that don't have `[]` at the end).
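
As a minimal sketch of the note above, the output of `terraform state list` can be narrowed down to the indexed resources with a filter like the following. The function name `indexed_suffix_resources` is hypothetical and used only for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helper, not part of the Terraform Example Foundation.

# Reads resource addresses on stdin and keeps only the
# random_project_id_suffix entries that end in an index such as [0].
indexed_suffix_resources() {
  # "|| true" keeps the exit status at zero when nothing matches.
  grep -E 'random_project_id_suffix\[[0-9]+\]$' || true
}
```

For example, `terraform state list | indexed_suffix_resources` would print only the addresses that are candidates for the `terraform taint` command.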

- - -

### Caller does not have permission in the Organization