
Consider updating maintenance_policy after cluster version update #20556

Open
ericzzzzzzz opened this issue Dec 3, 2024 · 3 comments · May be fixed by GoogleCloudPlatform/magic-modules#12569
Comments


ericzzzzzzz commented Dec 3, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to a user, that user is claiming responsibility for the issue.
  • Customers working with a Google Technical Account Manager or Customer Engineer can ask them to reach out internally to expedite investigation and resolution of this issue.

Terraform Version & Provider Version(s)

Terraform vX.X.X
on

  • provider registry.terraform.io/hashicorp/google vX.X.X
  • provider registry.terraform.io/hashicorp/google-beta vX.X.X

Affected Resource(s)

google_container_cluster

Terraform Configuration

Debug Output

No response

Expected Behavior

The maintenance_policy.maintenance_exclusion update should happen after the min_master_version update, or the two updates should be sent in the same request. This would let users configure a maintenance_exclusion end_time that is valid for the cluster version specified in min_master_version when both are set explicitly.

Actual Behavior

When min_master_version and maintenance_exclusion are updated at the same time, the apply can fail. The maintenance_exclusion update request is sent before the min_master_version update, so the GCP side checks maintenance_exclusion.end_time against the existing cluster version, which may not match what is specified in the Terraform file. If the new min_master_version is newer than the existing version and exclusion_rule.end_time was chosen for that new version, end_time can exceed the existing version's support date, and the GCP server returns an error.

Steps to reproduce

  1. terraform apply

Important Factoids

No response

References

No response

b/382558706

@ericzzzzzzz ericzzzzzzz added the bug label Dec 3, 2024
@github-actions github-actions bot added forward/review In review; remove label to forward service/container labels Dec 3, 2024
@NickElliot NickElliot self-assigned this Dec 3, 2024
Collaborator

NickElliot commented Dec 5, 2024

Should be a straightforward update, assuming this is an allowable order-of-operations change.

@NickElliot NickElliot removed their assignment Dec 5, 2024
@NickElliot NickElliot removed the forward/review In review; remove label to forward label Dec 5, 2024
Author

ericzzzzzzz commented Dec 6, 2024

Hi @NickElliot, thank you for the reply. Here are the steps to reproduce:

  1. Create a GKE cluster
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
  }
}

provider "google" {
  project = "<project-id>"
  region  = "us-central1"  # Or your preferred region
  zone    = "us-central1-a" # Or your preferred zone
}

resource "google_container_cluster" "primary" {
  name                     = "test-gke-cluster-1"
  location                 = "us-central1"  # Or your preferred region
  initial_node_count       = 1
  remove_default_node_pool = true

  master_auth {
    client_certificate_config {
      issue_client_certificate = false
    }
  }
}
resource "google_container_node_pool" "default-pool" {
  name       = "default-node-pool"
  location   = "us-central1"
  cluster    = google_container_cluster.primary.name
  node_count = 1

  node_config {
    machine_type = "e2-medium"
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}
output "cluster_name" {
  value = google_container_cluster.primary.name
}

output "cluster_endpoint" {
  value = google_container_cluster.primary.endpoint
}

output "node_pool_name" {
  value = google_container_node_pool.default-pool.name
}

At the moment, the created cluster should be on the default version, which is 1.30.* in the Regular channel, with support ending on 2025-09-30, as shown here: https://cloud.google.com/kubernetes-engine/docs/release-schedule
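To confirm which version the cluster actually landed on before making the second change, something like the following gcloud command can be used (the cluster name and location match the config above; the exact `--format` expression is an assumption about the field name):

```shell
# Print the current master version of the cluster created above.
# Assumes gcloud is authenticated against the same project.
gcloud container clusters describe test-gke-cluster-1 \
  --location us-central1 \
  --format='value(currentMasterVersion)'
```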

  2. Add min_master_version to upgrade the cluster to 1.31 and a maintenance_policy to configure a maintenance_exclusion
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
  }
}

provider "google" {
  project = "<project-id>"
  region  = "us-central1"  # Or your preferred region
  zone    = "us-central1-a" # Or your preferred zone
}

resource "google_container_cluster" "primary" {
  name                     = "test-gke-cluster-1"
  location                 = "us-central1"  # Or your preferred region
  initial_node_count       = 1
  remove_default_node_pool = true
  min_master_version       = "1.31"

  maintenance_policy {
    recurring_window {
      start_time = "2019-01-01T00:00:00Z"
      end_time   = "2019-01-02T00:00:00Z"
      recurrence = "FREQ=DAILY"
    }
    maintenance_exclusion {
      exclusion_name = "testname"
      start_time     = "2019-05-01T00:00:00Z"
      end_time       = "2025-10-30T00:00:00Z"
      exclusion_options {
        scope = "NO_MINOR_UPGRADES"
      }
    }
  }

  master_auth {
    client_certificate_config {
      issue_client_certificate = false
    }
  }
}
resource "google_container_node_pool" "default-pool" {
  name       = "default-node-pool"
  location   = "us-central1"
  cluster    = google_container_cluster.primary.name
  node_count = 1

  node_config {
    machine_type = "e2-medium"
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}
output "cluster_name" {
  value = google_container_cluster.primary.name
}

output "cluster_endpoint" {
  value = google_container_cluster.primary.endpoint
}

output "node_pool_name" {
  value = google_container_node_pool.default-pool.name
}

Note that the end_time is set to 2025-10-30, which is beyond the 1.30 support period but is a valid value for 1.31 (before 2025-12-22). When applying the above config, we get:

Error: googleapi: Error 400: MaintenancePolicy.maintenanceExclusions[“testname”].endTime needs to be before minor version 1.30 end of life: (2025-9-30). See release schedule at https://cloud.google.com/kubernetes-engine/docs/release-schedule.
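Until the provider reorders these calls, a possible workaround (a sketch, not an official recommendation) is to split the change into two applies so the version upgrade lands before the exclusion is validated:

```shell
# Step 1: apply the config with only min_master_version = "1.31" added,
# leaving the new maintenance_exclusion block out for now.
terraform apply

# Step 2: once the master upgrade completes, add the maintenance_exclusion
# block (end_time = 2025-10-30) and apply again; the server then validates
# end_time against the 1.31 support window instead of 1.30's.
terraform apply
```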


cslink commented Dec 12, 2024

I can look at this
