public gke with oidc not removing load balancer forwardingRules #2286

Open
yossig-runai opened this issue Feb 19, 2025 · 0 comments
Labels
bug Something isn't working

yossig-runai commented Feb 19, 2025

Hi, hoping this is the right place. We are using a very old module version (v28.0).
We are trying to move to the latest version, but we are facing two issues:

  1. The gke-oidc-envoy load balancer switched to internal.
  2. When deleting the environment, the forwardingRules of the envoy service load balancer are not deleted, and we get the following error:
    │ Error: Error when reading or editing Subnetwork: googleapi: Error 400: The subnetwork resource 'projects/xxx-lab/regions/us-east4/subnetworks/yossi-gke-new-subnetwork' is already being used by 'projects/xxx-lab/regions/us-east4/forwardingRules/a24aff4edcd304349b5486a4d4cdd838', resourceInUseByAnotherResource

Expected behavior

The entire setup should be destroyed successfully.

Observed behavior

No response

Terraform Configuration

locals {
  tags = {
    for key, value in var.tags : lower(key) => lower(value)
  }
  gpus = {
    "fake" = length(var.gpus.fake) > 0 ? [for gpu in var.gpus.fake : {
      gpuNodesCount = gpu.gpuNodesCount == 0 || gpu.gpuNodesCount == "" ? var.instance_count : gpu.gpuNodesCount
      nodePoolName  = gpu.nodePoolName
      }] : [{
      gpuNodesCount = var.instance_count
      nodePoolName  = "default"
    }]
    "real" = concat([{
      gpuNodesCount   = var.instance_count
      gpuInstanceType = var.instance_type
      gpuProduct      = ""
      gpuCount        = 0
      cosImageType    = var.ubuntu_image_type
      }], [for gpu in var.gpus.real : {
      gpuNodesCount   = gpu.gpuNodesCount
      gpuInstanceType = gpu.gpuInstanceType
      gpuProduct      = gpu.gpuProduct
      gpuCount        = gpu.gpuCount
      cosImageType    = var.cos_image_type
    }])
  }
  release_channel = tonumber(var.kubernetes_version) < 1.29 ? "EXTENDED" : "REGULAR"
}

resource "google_service_account" "default" {
  account_id   = var.name
  display_name = var.name
  project      = var.project
}

data "google_client_config" "default" {}

resource "google_compute_network" "vpc_network" {
  name    = "${var.name}-network"
  project = var.project
}

resource "google_compute_subnetwork" "network-with-private-secondary-ip-ranges" {
  name          = "${var.name}-subnetwork"
  ip_cidr_range = "10.2.0.0/16"
  region        = var.region
  network       = "${var.name}-network"
  project       = var.project
  secondary_ip_range {
    range_name    = "${var.name}-secondary-ip-range-pods"
    ip_cidr_range = "10.44.0.0/14"
  }
  secondary_ip_range {
    range_name    = "${var.name}-secondary-ip-range-services"
    ip_cidr_range = "10.48.0.0/20"
  }
  depends_on = [google_compute_network.vpc_network]
}

module "gke" {
  source                  = "terraform-google-modules/kubernetes-engine/google//modules/beta-public-cluster"
  version                 = "36.0.2"
  project_id              = var.project
  name                    = var.name
  region                  = var.region
  zones                   = ["${var.region}-a"]
  kubernetes_version      = var.kubernetes_version
  network                 = "${var.name}-network"
  subnetwork              = "${var.name}-subnetwork"
  ip_range_pods           = "${var.name}-secondary-ip-range-pods"
  ip_range_services       = "${var.name}-secondary-ip-range-services"
  enable_identity_service = var.enable_identity_service
  cluster_resource_labels = local.tags
  release_channel         = local.release_channel
  deletion_protection     = false
  depends_on              = [google_compute_subnetwork.network-with-private-secondary-ip-ranges]
}

resource "time_sleep" "wait" {
  depends_on      = [module.gke]
  create_duration = "180s"
}

resource "null_resource" "patch_clientconfig" {
  provisioner "local-exec" {
    command = <<EOF
kubectl --token ${nonsensitive(data.google_client_config.default.access_token)} --server "https://${module.gke.endpoint}" --insecure-skip-tls-verify patch clientconfig default -n kube-public --type merge -p '{"spec":{"authentication":[{"name":"oidc","oidc":{"clientID":"xxxx","issuerURI":"https://${var.tenant_url != "" ? var.tenant_url : format("%s", var.domain)}/auth/realms/${var.tenant}","kubectlRedirectURI":"http://localhost:8000/callback","userClaim":"sub","userPrefix":"-"}}]}}'
EOF
  }
  depends_on = [time_sleep.wait]
}

resource "google_container_node_pool" "node_pools" {
  for_each   = zipmap([for i in range(length(local.gpus[var.gpu_installation_type])) : i], local.gpus[var.gpu_installation_type])
  name       = "node-pool-${each.key}"
  cluster    = module.gke.name
  project    = var.project
  location   = var.region
  node_count = each.value.gpuNodesCount
  node_config {
    preemptible     = true
    machine_type    = var.gpu_installation_type == "fake" ? var.instance_type : each.value.gpuInstanceType
    service_account = google_service_account.default.email
    disk_size_gb    = var.volume_size
    disk_type       = var.disk_type
    image_type      = var.gpu_installation_type == "fake" ? var.ubuntu_image_type : var.cos_image_type
    spot            = var.use_spot_instances
    guest_accelerator {
      type  = var.gpu_installation_type == "real" ? each.value.gpuProduct : ""
      count = var.gpu_installation_type == "real" ? each.value.gpuCount : 0
    }
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}

resource "null_resource" "admin_user" {
  depends_on = [module.gke]
  provisioner "local-exec" {
    command = <<EOT
kubectl --token ${nonsensitive(data.google_client_config.default.access_token)} --server "https://${module.gke.endpoint}" --insecure-skip-tls-verify apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${var.admin_user_name}
  namespace: ${var.admin_user_namespace}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ${var.admin_user_name}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: User
  name: admin
  apiGroup: rbac.authorization.k8s.io 
- kind: ServiceAccount
  name: ${var.admin_user_name}
  namespace: ${var.admin_user_namespace}
- kind: Group
  name: system:masters
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: Secret
metadata:
  name: ${var.admin_user_name}
  namespace: ${var.admin_user_namespace}
  annotations:
    kubernetes.io/service-account.name: ${var.admin_user_name}
type: kubernetes.io/service-account-token
EOF
sleep 10
    EOT
  }
}

data "external" "get_admin_user_token" {
  program    = ["sh", "-c", "kubectl --token ${nonsensitive(data.google_client_config.default.access_token)} --server https://${module.gke.endpoint} --insecure-skip-tls-verify get secret -n ${var.admin_user_namespace} ${var.admin_user_name} -o jsonpath='{.data}'"]
  depends_on = [null_resource.admin_user]
}
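
Regarding the second issue, one possible workaround is sketched below. It is not part of the module and has not been verified against this setup. It assumes that the leftover forwarding rule belongs to the gke-oidc-envoy Service in the anthos-identity-service namespace, and that gcloud and kubectl are available on the machine running Terraform. The resource name cleanup_oidc_envoy_lb is hypothetical. The idea is to delete the Service at destroy time, before the cluster and subnetwork are removed, so that GKE has a chance to release the load balancer resources.

```hcl
# Hypothetical workaround sketch, not part of the module.
# Assumes gcloud and kubectl are installed where Terraform runs, and that the
# leftover forwarding rule belongs to the gke-oidc-envoy Service.
resource "null_resource" "cleanup_oidc_envoy_lb" {
  # Destroy-time provisioners may only reference self, so the values
  # needed at destroy time are stored in triggers.
  triggers = {
    cluster = module.gke.name
    region  = var.region
    project = var.project
  }

  provisioner "local-exec" {
    when    = destroy
    command = <<EOT
set -e
gcloud container clusters get-credentials ${self.triggers.cluster} --region ${self.triggers.region} --project ${self.triggers.project}
kubectl delete service gke-oidc-envoy -n anthos-identity-service --ignore-not-found --wait=true
EOT
  }

  # Depending on module.gke makes Terraform destroy this resource (and run
  # the cleanup) before the cluster and the subnetwork are destroyed.
  depends_on = [module.gke]
}
```

Note that get-credentials writes to the local kubeconfig, and GKE may recreate the Service while Identity Service is still enabled, so this is only a starting point for experimentation.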

Terraform Version

1.3.7

Terraform Provider Versions

6.20.0

Additional information

No response

yossig-runai added the bug label Feb 19, 2025