Allow google_s3_mirror to read from staging #1842

Merged (1 commit) on Oct 24, 2024

richardTowers
Contributor

... as well as integration.

The idea here is to mirror from staging instead of integration, so that we can reduce how often the integration databases are restored. See alphagov/govuk-helm-charts#2719 for more context.

Initially, we need to allow reading from both environments, otherwise we'll break the current mirroring from integration. There's probably no harm in it being able to read both in the long term, but strictly speaking it should only need staging once we've switched it over.
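
One way to keep a single policy definition that works in either environment is to derive the bucket name from the environment and gate the resource with `count`. The sketch below is an assumption about how this could be expressed, not the code in this PR: `var.aws_environment`, the `locals` and the resource labels are hypothetical names, and the integration bucket is assumed to follow the same naming pattern as the staging one. The actions, `Sid`, policy name and description are taken from the staging plan in this description.

```hcl
# Hypothetical input: the environment this configuration is applied to.
variable "aws_environment" {
  type = string
}

locals {
  # Assumption: mirroring is only permitted from these environments.
  mirror_environments = ["integration", "staging"]
  enable_mirror       = contains(local.mirror_environments, var.aws_environment)

  # Assumption: every environment names its bucket govuk-<env>-database-backups.
  backup_bucket_arn = "arn:aws:s3:::govuk-${var.aws_environment}-database-backups"
}

# Read-only access to the bucket itself (for listing) and to its objects.
data "aws_iam_policy_document" "google_s3_mirror" {
  statement {
    sid     = "GoogleReadBucket"
    effect  = "Allow"
    actions = ["s3:List*", "s3:Get*"]

    resources = [
      local.backup_bucket_arn,
      "${local.backup_bucket_arn}/*",
    ]
  }
}

# Created only in environments that allow mirroring; zero instances elsewhere.
resource "aws_iam_policy" "google_s3_mirror" {
  count       = local.enable_mirror ? 1 : 0
  name        = "google-s3-mirror"
  description = "Allows a Google Cloud Platform project to mirror S3 buckets."
  policy      = data.aws_iam_policy_document.google_s3_mirror.json
}
```

With a gate like this, an environment outside the list gets no instances of the resource, while adding an environment to the list creates the policy there.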

This produces no-changes plans in integration and production. In staging we have:

Terraform will perform the following actions:

  # aws_iam_policy.google-s3-mirror[0] will be created
  + resource "aws_iam_policy" "google-s3-mirror" {
      + arn         = (known after apply)
      + description = "Allows a Google Cloud Platform project to mirror S3 buckets."
      + id          = (known after apply)
      + name        = "google-s3-mirror"
      + path        = "/"
      + policy      = jsonencode(
            {
              + Statement = [
                  + {
                      + Action   = [
                          + "s3:List*",
                          + "s3:Get*",
                        ]
                      + Effect   = "Allow"
                      + Resource = [
                          + "arn:aws:s3:::govuk-staging-database-backups/*",
                          + "arn:aws:s3:::govuk-staging-database-backups",
                        ]
                      + Sid      = "GoogleReadBucket"
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + policy_id   = (known after apply)
      + tags_all    = (known after apply)
    }

  # aws_iam_role.google-s3-mirror[0] will be created
  + resource "aws_iam_role" "google-s3-mirror" {
      + arn                   = (known after apply)
      + assume_role_policy    = jsonencode(
            {
              + Statement = [
                  + {
                      + Action    = "sts:AssumeRoleWithWebIdentity"
                      + Condition = {
                          + StringEquals = {
                              + accounts.google.com:sub = "107768730699967087212"
                            }
                        }
                      + Effect    = "Allow"
                      + Principal = {
                          + Federated = "accounts.google.com"
                        }
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + create_date           = (known after apply)
      + force_detach_policies = false
      + id                    = (known after apply)
      + managed_policy_arns   = (known after apply)
      + max_session_duration  = 3600
      + name                  = "google-s3-mirror"
      + name_prefix           = (known after apply)
      + path                  = "/"
      + tags_all              = (known after apply)
      + unique_id             = (known after apply)

      + inline_policy {
          + name   = (known after apply)
          + policy = (known after apply)
        }
    }

  # aws_iam_role_policy_attachment.google-s3-mirror-access[0] will be created
  + resource "aws_iam_role_policy_attachment" "google-s3-mirror-access" {
      + id         = (known after apply)
      + policy_arn = (known after apply)
      + role       = "google-s3-mirror"
    }

Plan: 3 to add, 0 to change, 0 to destroy.
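
For readers less familiar with web identity federation, the role in the plan above could be written roughly as follows. This is a minimal sketch under assumptions, not the module's actual source: `var.mirror_policy_arn` and the resource labels are hypothetical, while the `sub` value, role name and session duration are copied from the plan.

```hcl
# Hypothetical input: ARN of the read-only bucket policy to attach to the role.
variable "mirror_policy_arn" {
  type = string
}

# Trust policy: only the Google identity with this subject may assume the role.
data "aws_iam_policy_document" "google_assume_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = ["accounts.google.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "accounts.google.com:sub"
      values   = ["107768730699967087212"]
    }
  }
}

resource "aws_iam_role" "google_s3_mirror" {
  name                 = "google-s3-mirror"
  max_session_duration = 3600
  assume_role_policy   = data.aws_iam_policy_document.google_assume_role.json
}

# Grants the role the S3 read permissions defined in the separate policy.
resource "aws_iam_role_policy_attachment" "google_s3_mirror_access" {
  role       = aws_iam_role.google_s3_mirror.name
  policy_arn = var.mirror_policy_arn
}
```

The `StringEquals` condition on `accounts.google.com:sub` is what restricts the role to one Google identity; without it, any identity federated through `accounts.google.com` could assume the role.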

@richardTowers richardTowers merged commit 81ee9f6 into main Oct 24, 2024
2 checks passed
@richardTowers richardTowers deleted the google_s3_mirror_read_staging branch October 24, 2024 08:41
richardTowers added a commit to alphagov/govuk-s3-mirror that referenced this pull request Oct 24, 2024
The idea here is to mirror from staging instead of integration, so that
we can reduce the frequency that the integration databases are restored.
See alphagov/govuk-helm-charts#2719 for more
context.

We created the role in AWS which allows govuk-s3-mirror to read this
bucket in alphagov/govuk-aws#1842

There are many resources in this project which have "integration" in
their names, which could be confusing. I'll rename these in a
subsequent commit.
richardTowers added a commit to alphagov/govuk-s3-mirror that referenced this pull request Oct 28, 2024
This duplicates the pubsub.tf and transfer.tf files, renaming the old
versions to -integration and marking them as legacy with a comment.

The names of the resources in the new versions don't refer to the
environment they're reading from, because it shouldn't matter.

In practice, they will read from staging for now. We created the role in
AWS which allows govuk-s3-mirror to read this bucket in alphagov/govuk-aws#1842.
One day we might read directly from production, but this would require a
little bit of additional work to do the same sanitization that the
backup process does in staging.

Terraform plan:

    Terraform will perform the following actions:

      # google_pubsub_subscription.govuk_database_backups will be created
      + resource "google_pubsub_subscription" "govuk_database_backups" {
          + ack_deadline_seconds       = (known after apply)
          + effective_labels           = {
              + "goog-terraform-provisioned" = "true"
            }
          + enable_message_ordering    = false
          + id                         = (known after apply)
          + message_retention_duration = "604800s"
          + name                       = "govuk-database-backups"
          + project                    = "govuk-s3-mirror"
          + retain_acked_messages      = true
          + terraform_labels           = {
              + "goog-terraform-provisioned" = "true"
            }
          + topic                      = "govuk-database-backups"

          + expiration_policy {}
        }

      # google_pubsub_topic.govuk_database_backups will be created
      + resource "google_pubsub_topic" "govuk_database_backups" {
          + effective_labels           = {
              + "goog-terraform-provisioned" = "true"
            }
          + id                         = (known after apply)
          + message_retention_duration = "604800s"
          + name                       = "govuk-database-backups"
          + project                    = "govuk-s3-mirror"
          + terraform_labels           = {
              + "goog-terraform-provisioned" = "true"
            }

          + message_storage_policy {
              + allowed_persistence_regions = [
                  + "europe-west2",
                ]
            }
        }

      # google_pubsub_topic_iam_policy.govuk_database_backups will be created
      + resource "google_pubsub_topic_iam_policy" "govuk_database_backups" {
          + etag        = (known after apply)
          + id          = (known after apply)
          + policy_data = jsonencode(
                {
                  + bindings = [
                      + {
                          + members = [
                              + "REDACTED",
                            ]
                          + role    = "roles/pubsub.publisher"
                        },
                    ]
                }
            )
          + project     = (known after apply)
          + topic       = "govuk-database-backups"
        }

      # google_storage_bucket.govuk_database_backups will be created
      + resource "google_storage_bucket" "govuk_database_backups" {
          + effective_labels            = {
              + "goog-terraform-provisioned" = "true"
            }
          + force_destroy               = false
          + id                          = (known after apply)
          + location                    = "EUROPE-WEST2"
          + name                        = "govuk-s3-mirror_govuk-database-backups"
          + project                     = (known after apply)
          + project_number              = (known after apply)
          + public_access_prevention    = (known after apply)
          + rpo                         = (known after apply)
          + self_link                   = (known after apply)
          + storage_class               = "STANDARD"
          + terraform_labels            = {
              + "goog-terraform-provisioned" = "true"
            }
          + uniform_bucket_level_access = true
          + url                         = (known after apply)

          + versioning {
              + enabled = false
            }
        }

      # google_storage_bucket_iam_policy.govuk_database_backups will be created
      + resource "google_storage_bucket_iam_policy" "govuk_database_backups" {
          + bucket      = "govuk-s3-mirror_govuk-database-backups"
          + etag        = (known after apply)
          + id          = (known after apply)
          + policy_data = jsonencode(
                {
                  + bindings = [
                      + {
                          + members = [
                              + "REDACTED",
                            ]
                          + role    = "roles/storage.admin"
                        },
                      + {
                          + members = [
                              + "projectEditor:govuk-s3-mirror",
                              + "projectOwner:govuk-s3-mirror",
                            ]
                          + role    = "roles/storage.legacyBucketOwner"
                        },
                      + {
                          + members = [
                              + "projectViewer:govuk-s3-mirror",
                            ]
                          + role    = "roles/storage.legacyBucketReader"
                        },
                      + {
                          + members = [
                              + "projectEditor:govuk-s3-mirror",
                              + "projectOwner:govuk-s3-mirror",
                            ]
                          + role    = "roles/storage.legacyObjectOwner"
                        },
                      + {
                          + members = [
                              + "projectViewer:govuk-s3-mirror",
                            ]
                          + role    = "roles/storage.legacyObjectReader"
                        },
                      + {
                          + members = [ "REDACTED" ]
                          + role    = "roles/storage.objectViewer"
                        },
                    ]
                }
            )
        }

      # google_storage_notification.govuk_database_backups will be created
      + resource "google_storage_notification" "govuk_database_backups" {
          + bucket          = "govuk-s3-mirror_govuk-database-backups"
          + event_types     = [
              + "OBJECT_FINALIZE",
            ]
          + id              = (known after apply)
          + notification_id = (known after apply)
          + payload_format  = "JSON_API_V1"
          + self_link       = (known after apply)
          + topic           = (known after apply)
        }

      # google_storage_notification.govuk_database_backups-govuk_knowledge_graph will be created
      + resource "google_storage_notification" "govuk_database_backups-govuk_knowledge_graph" {
          + bucket          = "govuk-s3-mirror_govuk-database-backups"
          + event_types     = [
              + "OBJECT_FINALIZE",
            ]
          + id              = (known after apply)
          + notification_id = (known after apply)
          + payload_format  = "JSON_API_V1"
          + self_link       = (known after apply)
          + topic           = "/projects/govuk-knowledge-graph/topics/govuk-database-backups"
        }

      # google_storage_notification.govuk_database_backups-govuk_knowledge_graph_dev will be created
      + resource "google_storage_notification" "govuk_database_backups-govuk_knowledge_graph_dev" {
          + bucket          = "govuk-s3-mirror_govuk-database-backups"
          + event_types     = [
              + "OBJECT_FINALIZE",
            ]
          + id              = (known after apply)
          + notification_id = (known after apply)
          + payload_format  = "JSON_API_V1"
          + self_link       = (known after apply)
          + topic           = "/projects/govuk-knowledge-graph-dev/topics/govuk-database-backups"
        }

      # google_storage_notification.govuk_database_backups-govuk_knowledge_graph_staging will be created
      + resource "google_storage_notification" "govuk_database_backups-govuk_knowledge_graph_staging" {
          + bucket          = "govuk-s3-mirror_govuk-database-backups"
          + event_types     = [
              + "OBJECT_FINALIZE",
            ]
          + id              = (known after apply)
          + notification_id = (known after apply)
          + payload_format  = "JSON_API_V1"
          + self_link       = (known after apply)
          + topic           = "/projects/govuk-knowledge-graph-staging/topics/govuk-database-backups"
        }

      # google_storage_transfer_job.govuk-integration-database-backups will be updated in-place
      ~ resource "google_storage_transfer_job" "govuk-integration-database-backups" {
            id                     = "govuk-s3-mirror/REDACTED"
            name                   = "transferJobs/REDACTED"
            # (5 unchanged attributes hidden)

          ~ transfer_spec {
              ~ aws_s3_data_source {
                    # (2 unchanged attributes hidden)

                  - aws_access_key {}
                }

                # (3 unchanged blocks hidden)
            }

            # (1 unchanged block hidden)
        }

      # google_storage_transfer_job.govuk_database_backups will be created
      + resource "google_storage_transfer_job" "govuk_database_backups" {
          + creation_time          = (known after apply)
          + deletion_time          = (known after apply)
          + description            = "Mirror the GOV.UK S3 bucket govuk-staging-database-backups"
          + id                     = (known after apply)
          + last_modification_time = (known after apply)
          + name                   = (known after apply)
          + project                = "govuk-s3-mirror"
          + status                 = "ENABLED"

          + schedule {
              + repeat_interval = "3600s"

              + schedule_start_date {
                  + day   = 7
                  + month = 9
                  + year  = 2022
                }

              + start_time_of_day {
                  + hours   = 0
                  + minutes = 0
                  + nanos   = 0
                  + seconds = 0
                }
            }

          + transfer_spec {
              + sink_agent_pool_name   = (known after apply)
              + source_agent_pool_name = (known after apply)

              + aws_s3_data_source {
                  + bucket_name = "govuk-staging-database-backups"
                  + role_arn    = "arn:aws:iam::696911096973:policy/google-s3-mirror"
                }

              + gcs_data_sink {
                  + bucket_name = "govuk-s3-mirror_govuk-database-backups"
                  + path        = (known after apply)
                }

              + object_conditions {
                  + include_prefixes = [
                      + "content-store-postgres/",
                      + "mongo-api/",
                      + "publishing-api-postgres/",
                      + "shared-documentdb/",
                      + "support-api-postgres/",
                    ]
                }

              + transfer_options {
                  + delete_objects_from_source_after_transfer  = false
                  + delete_objects_unique_in_sink              = true
                  + overwrite_objects_already_existing_in_sink = false
                }
            }
        }

    Plan: 10 to add, 1 to change, 0 to destroy.
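
The new transfer job in the plan above corresponds to configuration along these lines. This is a sketch reconstructed from the plan output, not the contents of transfer.tf; `var.aws_role_arn` is a hypothetical variable standing in for the role created in alphagov/govuk-aws#1842.

```hcl
# Hypothetical input: ARN of the AWS IAM role the transfer service assumes.
# IAM role ARNs take the form arn:aws:iam::<account-id>:role/<role-name>.
variable "aws_role_arn" {
  type = string
}

resource "google_storage_transfer_job" "govuk_database_backups" {
  description = "Mirror the GOV.UK S3 bucket govuk-staging-database-backups"
  project     = "govuk-s3-mirror"

  # Run hourly, starting from the date shown in the plan.
  schedule {
    repeat_interval = "3600s"

    schedule_start_date {
      year  = 2022
      month = 9
      day   = 7
    }

    start_time_of_day {
      hours   = 0
      minutes = 0
      seconds = 0
      nanos   = 0
    }
  }

  transfer_spec {
    # Source: the staging backups bucket, read via role-based federation
    # rather than a long-lived AWS access key.
    aws_s3_data_source {
      bucket_name = "govuk-staging-database-backups"
      role_arn    = var.aws_role_arn
    }

    gcs_data_sink {
      bucket_name = "govuk-s3-mirror_govuk-database-backups"
    }

    # Only these database backup prefixes are mirrored.
    object_conditions {
      include_prefixes = [
        "content-store-postgres/",
        "mongo-api/",
        "publishing-api-postgres/",
        "shared-documentdb/",
        "support-api-postgres/",
      ]
    }

    # Keep the sink an exact mirror: never touch the source, and delete
    # objects from the sink once they disappear from the source.
    transfer_options {
      delete_objects_from_source_after_transfer  = false
      delete_objects_unique_in_sink              = true
      overwrite_objects_already_existing_in_sink = false
    }
  }
}
```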
richardTowers added a commit to alphagov/govuk-s3-mirror that referenced this pull request Oct 28, 2024
This duplicates the pubsub.tf and transfer.tf files, suffixing the new
versions with -staging and marking the old versions as legacy with a comment.

A couple of data blocks only need to exist in one place, so I've moved
those to the new files.

The names of the resources in the new versions don't refer to the
environment they're reading from, because it shouldn't matter.

In practice, they will read from staging for now. We created the role in
AWS which allows govuk-s3-mirror to read this bucket in alphagov/govuk-aws#1842.
One day we might read directly from production, but this would require a
little bit of additional work to do the same sanitization that the
backup process does in staging.

Terraform plan:

    Terraform will perform the following actions:

      # google_pubsub_subscription.govuk_database_backups will be created
      + resource "google_pubsub_subscription" "govuk_database_backups" {
          + ack_deadline_seconds       = (known after apply)
          + effective_labels           = {
              + "goog-terraform-provisioned" = "true"
            }
          + enable_message_ordering    = false
          + id                         = (known after apply)
          + message_retention_duration = "604800s"
          + name                       = "govuk-database-backups"
          + project                    = "govuk-s3-mirror"
          + retain_acked_messages      = true
          + terraform_labels           = {
              + "goog-terraform-provisioned" = "true"
            }
          + topic                      = "govuk-database-backups"

          + expiration_policy {}
        }

      # google_pubsub_topic.govuk_database_backups will be created
      + resource "google_pubsub_topic" "govuk_database_backups" {
          + effective_labels           = {
              + "goog-terraform-provisioned" = "true"
            }
          + id                         = (known after apply)
          + message_retention_duration = "604800s"
          + name                       = "govuk-database-backups"
          + project                    = "govuk-s3-mirror"
          + terraform_labels           = {
              + "goog-terraform-provisioned" = "true"
            }

          + message_storage_policy {
              + allowed_persistence_regions = [
                  + "europe-west2",
                ]
            }
        }

      # google_pubsub_topic_iam_policy.govuk_database_backups will be created
      + resource "google_pubsub_topic_iam_policy" "govuk_database_backups" {
          + etag        = (known after apply)
          + id          = (known after apply)
          + policy_data = jsonencode(
                {
                  + bindings = [
                      + {
                          + members = [
                              + "REDACTED",
                            ]
                          + role    = "roles/pubsub.publisher"
                        },
                    ]
                }
            )
          + project     = (known after apply)
          + topic       = "govuk-database-backups"
        }

      # google_storage_bucket.govuk_database_backups will be created
      + resource "google_storage_bucket" "govuk_database_backups" {
          + effective_labels            = {
              + "goog-terraform-provisioned" = "true"
            }
          + force_destroy               = false
          + id                          = (known after apply)
          + location                    = "EUROPE-WEST2"
          + name                        = "govuk-s3-mirror_govuk-database-backups"
          + project                     = (known after apply)
          + project_number              = (known after apply)
          + public_access_prevention    = (known after apply)
          + rpo                         = (known after apply)
          + self_link                   = (known after apply)
          + storage_class               = "STANDARD"
          + terraform_labels            = {
              + "goog-terraform-provisioned" = "true"
            }
          + uniform_bucket_level_access = true
          + url                         = (known after apply)

          + versioning {
              + enabled = false
            }
        }

      # google_storage_bucket_iam_policy.govuk_database_backups will be created
      + resource "google_storage_bucket_iam_policy" "govuk_database_backups" {
          + bucket      = "govuk-s3-mirror_govuk-database-backups"
          + etag        = (known after apply)
          + id          = (known after apply)
          + policy_data = jsonencode(
                {
                  + bindings = [
                      + {
                          + members = [
                              + "REDACTED",
                            ]
                          + role    = "roles/storage.admin"
                        },
                      + {
                          + members = [
                              + "projectEditor:govuk-s3-mirror",
                              + "projectOwner:govuk-s3-mirror",
                            ]
                          + role    = "roles/storage.legacyBucketOwner"
                        },
                      + {
                          + members = [
                              + "projectViewer:govuk-s3-mirror",
                            ]
                          + role    = "roles/storage.legacyBucketReader"
                        },
                      + {
                          + members = [
                              + "projectEditor:govuk-s3-mirror",
                              + "projectOwner:govuk-s3-mirror",
                            ]
                          + role    = "roles/storage.legacyObjectOwner"
                        },
                      + {
                          + members = [
                              + "projectViewer:govuk-s3-mirror",
                            ]
                          + role    = "roles/storage.legacyObjectReader"
                        },
                      + {
                          + members = [ "REDACTED" ]
                          + role    = "roles/storage.objectViewer"
                        },
                    ]
                }
            )
        }

      # google_storage_notification.govuk_database_backups will be created
      + resource "google_storage_notification" "govuk_database_backups" {
          + bucket          = "govuk-s3-mirror_govuk-database-backups"
          + event_types     = [
              + "OBJECT_FINALIZE",
            ]
          + id              = (known after apply)
          + notification_id = (known after apply)
          + payload_format  = "JSON_API_V1"
          + self_link       = (known after apply)
          + topic           = (known after apply)
        }

      # google_storage_notification.govuk_database_backups-govuk_knowledge_graph will be created
      + resource "google_storage_notification" "govuk_database_backups-govuk_knowledge_graph" {
          + bucket          = "govuk-s3-mirror_govuk-database-backups"
          + event_types     = [
              + "OBJECT_FINALIZE",
            ]
          + id              = (known after apply)
          + notification_id = (known after apply)
          + payload_format  = "JSON_API_V1"
          + self_link       = (known after apply)
          + topic           = "/projects/govuk-knowledge-graph/topics/govuk-database-backups"
        }

      # google_storage_notification.govuk_database_backups-govuk_knowledge_graph_dev will be created
      + resource "google_storage_notification" "govuk_database_backups-govuk_knowledge_graph_dev" {
          + bucket          = "govuk-s3-mirror_govuk-database-backups"
          + event_types     = [
              + "OBJECT_FINALIZE",
            ]
          + id              = (known after apply)
          + notification_id = (known after apply)
          + payload_format  = "JSON_API_V1"
          + self_link       = (known after apply)
          + topic           = "/projects/govuk-knowledge-graph-dev/topics/govuk-database-backups"
        }

      # google_storage_notification.govuk_database_backups-govuk_knowledge_graph_staging will be created
      + resource "google_storage_notification" "govuk_database_backups-govuk_knowledge_graph_staging" {
          + bucket          = "govuk-s3-mirror_govuk-database-backups"
          + event_types     = [
              + "OBJECT_FINALIZE",
            ]
          + id              = (known after apply)
          + notification_id = (known after apply)
          + payload_format  = "JSON_API_V1"
          + self_link       = (known after apply)
          + topic           = "/projects/govuk-knowledge-graph-staging/topics/govuk-database-backups"
        }

      # google_storage_transfer_job.govuk-integration-database-backups will be updated in-place
      ~ resource "google_storage_transfer_job" "govuk-integration-database-backups" {
            id                     = "govuk-s3-mirror/REDACTED"
            name                   = "transferJobs/REDACTED"
            # (5 unchanged attributes hidden)

          ~ transfer_spec {
              ~ aws_s3_data_source {
                    # (2 unchanged attributes hidden)

                  - aws_access_key {}
                }

                # (3 unchanged blocks hidden)
            }

            # (1 unchanged block hidden)
        }

      # google_storage_transfer_job.govuk_database_backups will be created
      + resource "google_storage_transfer_job" "govuk_database_backups" {
          + creation_time          = (known after apply)
          + deletion_time          = (known after apply)
          + description            = "Mirror the GOV.UK S3 bucket govuk-staging-database-backups"
          + id                     = (known after apply)
          + last_modification_time = (known after apply)
          + name                   = (known after apply)
          + project                = "govuk-s3-mirror"
          + status                 = "ENABLED"

          + schedule {
              + repeat_interval = "3600s"

              + schedule_start_date {
                  + day   = 7
                  + month = 9
                  + year  = 2022
                }

              + start_time_of_day {
                  + hours   = 0
                  + minutes = 0
                  + nanos   = 0
                  + seconds = 0
                }
            }

          + transfer_spec {
              + sink_agent_pool_name   = (known after apply)
              + source_agent_pool_name = (known after apply)

              + aws_s3_data_source {
                  + bucket_name = "govuk-staging-database-backups"
                  + role_arn    = "arn:aws:iam::696911096973:policy/google-s3-mirror"
                }

              + gcs_data_sink {
                  + bucket_name = "govuk-s3-mirror_govuk-database-backups"
                  + path        = (known after apply)
                }

              + object_conditions {
                  + include_prefixes = [
                      + "content-store-postgres/",
                      + "mongo-api/",
                      + "publishing-api-postgres/",
                      + "shared-documentdb/",
                      + "support-api-postgres/",
                    ]
                }

              + transfer_options {
                  + delete_objects_from_source_after_transfer  = false
                  + delete_objects_unique_in_sink              = true
                  + overwrite_objects_already_existing_in_sink = false
                }
            }
        }

    Plan: 10 to add, 1 to change, 0 to destroy.
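
The notification resources in the plan above depend on Cloud Storage being allowed to publish to the topic. A rough sketch of that wiring follows, under two stated assumptions: that the redacted `roles/pubsub.publisher` member is the project's Cloud Storage service agent, and that a non-authoritative `google_pubsub_topic_iam_member` is an acceptable stand-in for the `google_pubsub_topic_iam_policy` shown in the plan. The resource labels are hypothetical.

```hcl
# The service agent that Cloud Storage uses to publish notifications.
data "google_storage_project_service_account" "gcs" {}

resource "google_pubsub_topic" "backups" {
  name                       = "govuk-database-backups"
  message_retention_duration = "604800s"

  message_storage_policy {
    allowed_persistence_regions = ["europe-west2"]
  }
}

# Assumption: the redacted publisher in the plan is this service agent.
resource "google_pubsub_topic_iam_member" "gcs_publisher" {
  topic  = google_pubsub_topic.backups.id
  role   = "roles/pubsub.publisher"
  member = "serviceAccount:${data.google_storage_project_service_account.gcs.email_address}"
}

# Publish a message whenever a mirrored object finishes writing.
resource "google_storage_notification" "backups" {
  bucket         = "govuk-s3-mirror_govuk-database-backups"
  payload_format = "JSON_API_V1"
  topic          = google_pubsub_topic.backups.id
  event_types    = ["OBJECT_FINALIZE"]

  # The notification cannot be created until publishing is permitted.
  depends_on = [google_pubsub_topic_iam_member.gcs_publisher]
}
```

The explicit `depends_on` matters because nothing in the notification's arguments references the IAM grant, so Terraform would otherwise be free to create the two in either order.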