
refactor: Upgrade to v18 of EKS module #64

Merged
merged 5 commits on Jul 20, 2022

Conversation

bryantbiggs (Contributor) commented Jun 9, 2022

This updates the project to use the latest v18 EKS module, keeping the overall layout similar to what was previously defined.

  • Managed node groups are used instead of self-managed node groups because of the number of benefits they provide to users operating EKS clusters.
  • The lockfile was removed and an entry added to the gitignore file - this is going to be specific to how users initialize their project, so I don't believe it's valid here
  • See other comments below for more context

Motivation

We would like to move users away from v17, and any newcomers should start off using v18. I don't know where the associated article repo is located, but I suspect it will need to be updated to reflect these changes: https://learn.hashicorp.com/tutorials/terraform/eks

Closes #14
Closes #19
Closes #22
Closes #25
Closes #28
Closes #29
Closes #31
Closes #33
Closes #35
Closes #42
Closes #43
Closes #51
Closes #56
Closes #61
Closes #63

Please let me know if there are any questions or concerns

eks-cluster.tf Outdated
workers_group_defaults = {
root_volume_type = "gp2"
# Extend cluster security group rules
cluster_security_group_additional_rules = {
bryantbiggs (Contributor Author)

The security group rules added here put the definition back in line with the v17 version. In v18 the security group rules were reduced to only the bare minimum required for a cluster to provision successfully, while allowing users to open or extend access as they see fit for their workload.
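
To illustrate, a minimal sketch of how access can be extended through the v18 module's rule maps; the specific rule names, ports, and protocols below are placeholders for illustration, not the exact rules added in this PR:

# Hedged sketch: attribute names follow the v18 terraform-aws-eks inputs,
# but the rules themselves are illustrative placeholders.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.0"

  # ... cluster configuration ...

  # Extend the cluster (control plane) security group
  cluster_security_group_additional_rules = {
    egress_nodes_ephemeral_ports_tcp = {
      description                = "Control plane to node ephemeral ports"
      protocol                   = "tcp"
      from_port                  = 1025
      to_port                    = 65535
      type                       = "egress"
      source_node_security_group = true
    }
  }

  # Extend the shared node security group
  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node to node, all ports and protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
  }
}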

host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

exec {
bryantbiggs (Contributor Author)

The exec plugin method is recommended for EKS by the provider to avoid token expiration: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#exec-plugins
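
As a rough sketch of that pattern (assuming the v18 module outputs referenced elsewhere in this diff, not a verbatim copy of the PR's provider block):

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # A fresh token is requested on each Terraform run, so it cannot expire mid-apply
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}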

@@ -1,12 +1,31 @@
# Kubernetes provider
bryantbiggs (Contributor Author)

It's common practice to have a generic main.tf for bits that are generic and not resource specific. This captures the providers, common local/data sources, and the random string resource used in the naming scheme.
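
A hedged sketch of that layout; the region variable is an assumption for illustration, while the data source, random string, and cluster name local are referenced elsewhere in this diff:

provider "aws" {
  region = var.region # assumed variable name
}

# Used to pick availability zones for the VPC subnets
data "aws_availability_zones" "available" {}

# Random suffix so repeated runs don't collide on cluster names
resource "random_string" "suffix" {
  length  = 8
  special = false
}

locals {
  cluster_name = "education-eks-${random_string.suffix.result}"
}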

value = module.eks.cluster_security_group_id
}

output "kubectl_config" {
bryantbiggs (Contributor Author)

This output (or an equivalent) is no longer provided in v18; users can retrieve a kubeconfig through the AWS CLI:

aws eks update-kubeconfig --region <REGION> --name <CLUSTER_NAME>

@@ -28,20 +27,3 @@ resource "aws_security_group" "worker_group_mgmt_two" {
]
}
}

resource "aws_security_group" "all_worker_mgmt" {
bryantbiggs (Contributor Author)

This resource looks like it was never used anywhere, so it was removed.


resource "aws_security_group" "worker_group_mgmt_one" {
name_prefix = "worker_group_mgmt_one"
resource "aws_security_group" "node_group_one" {
bryantbiggs (Contributor Author)

Naming modified to reflect what is used by the module and AWS: instead of "workers" it's "self-managed node groups", and instead of "node groups" it's "EKS managed node groups". Here I have just used the generic "node group" for brevity.

}

random = {
source = "hashicorp/random"
version = "3.1.0"
}

local = {
bryantbiggs (Contributor Author)

The local and null providers are not utilized here, so they were removed.

@@ -2,27 +2,17 @@ terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 3.20.0"
version = ">= 3.72"
bryantbiggs (Contributor Author)

Not really pertinent here, but set to match the module's version requirements.

cluster_name = "education-eks-${random_string.suffix.result}"
}
cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
bryantbiggs (Contributor Author)

Adding slice() since some regions have more than 3 AZs, and we want to ensure we only pull 3 to match the prescribed subnets.

enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true

tags = {
bryantbiggs (Contributor Author)

Not required; removed.

@BrewedCoffee

Thanks for doing this :) would have definitely helped me a few weeks ago, and I'm sure it'll help hundreds or thousands to come!

@someshkoli

@bryantbiggs can you point me to the change where it's fixing terraform-aws-modules/terraform-aws-eks#978?

bryantbiggs (Contributor Author)

@im2nguyen any thoughts on this change?

im2nguyen (Collaborator)

Hi @bryantbiggs, thank you for updating this!

@judithpatudith can you have the team look into updating the tutorial? This is a high-value update.

alanszlosek (Contributor)

Thank you @bryantbiggs, this is a fantastic contribution!

I'm on the education team at HashiCorp and am hoping to update our EKS tutorial to work with the changes you've made in this PR. Using version 18.24.1 of the EKS module and version 4.21.0 of the AWS provider, the infrastructure deploys just fine (I upgraded to get rid of some TF warnings). But I'm unable to access the Kubernetes dashboard through the kubectl proxy. Can you see whether it works for you?

I can make other k8s API calls (like curl http://127.0.0.1:8001/api/v1/namespaces/kubernetes-dashboard/services/) through the proxy, but when I try to visit the dashboard URL it spins and then eventually returns this:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "error trying to reach service: dial tcp 10.0.1.76:8443: connect: connection timed out",
  "reason": "ServiceUnavailable",
  "code": 503
}

I have a hunch it has something to do with security groups, but I'm not familiar enough with EKS to know what to modify. Do you have any ideas?

Thanks again for your help, we really appreciate it.

bryantbiggs (Contributor Author)

ah yes, I think I know what it is - let me try on my end and update the PR shortly

bryantbiggs (Contributor Author) commented Jul 6, 2022

Validation steps:

  1. terraform init && terraform apply
  2. kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep service-controller-token | awk '{print $1}')
  3. Copy token from step 2
  4. kubectl proxy
  5. Open http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:https/proxy/
  6. Paste token

Please note that the project now provisions the metrics-server and kubernetes-dashboard through the Helm provider. The versions of those resources used in the current guide are quite a bit out of date, and now there are fewer manual tasks required by users (unless that is desired).
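
For context, a hedged sketch of what Helm-based provisioning of those add-ons can look like; the chart repositories are the upstream defaults and no versions are pinned here, so this is illustrative rather than the exact resources in the PR:

provider "helm" {
  kubernetes {
    host                   = module.eks.cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
    }
  }
}

# Metrics server for kubectl top and the dashboard's resource graphs
resource "helm_release" "metrics_server" {
  name       = "metrics-server"
  repository = "https://kubernetes-sigs.github.io/metrics-server/"
  chart      = "metrics-server"
  namespace  = "kube-system"
}

# Kubernetes dashboard, reachable via kubectl proxy as described in the steps above
resource "helm_release" "kubernetes_dashboard" {
  name             = "kubernetes-dashboard"
  repository       = "https://kubernetes.github.io/dashboard/"
  chart            = "kubernetes-dashboard"
  namespace        = "kubernetes-dashboard"
  create_namespace = true
}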

cc @alanszlosek

alanszlosek (Contributor) commented Jul 12, 2022

Thanks, @bryantbiggs, this is great! Those changes fixed the proxy issue. Based on what I'm reading, attach_cluster_primary_security_group attaches the cluster security group to all servers within the cluster and permits all traffic between any servers with that SG attached. I'm guessing that change fixed the proxy issue by allowing ingress to nodes FROM non-node (control plane?) servers, whereas in earlier PR commits, nodes were only permitting connections from servers with the node SG and nothing else. Am I understanding things correctly?

Switching gears ...

I've been talking to my team about the k8s dashboard installation section of the tutorial. The goal of the section was to "verify" the cluster, but we think it would be sufficient and simpler to have the user run kubectl cluster-info and kubectl get nodes instead. What are your thoughts?

If you agree, this means we can remove the helm provider as well as the service account, role binding, metrics server, and dashboard resources from the PR entirely. I think you can revert some of your changes to the kubernetes provider as well.

Lastly, can you restore the terraform lock file? It's a recommendation within our docs, and it helps us ensure that learners don't encounter any version-related hiccups while going through a tutorial. Does that make sense?

Again, thank you!

bryantbiggs (Contributor Author)

I'm guessing that change fixed the proxy issue by allowing ingress to nodes FROM non-node (control plane?) servers. Whereas in earlier PR commits, nodes were only permitting connections from servers with the node SG and nothing else. Am I understanding things correctly?

Close - the EKS module in v18 was designed to only contain the bare minimum security group requirements to bring up a cluster successfully. Users then have the option to start opening up additional access to support the workloads they run. It lets users start from a secure, least-access perspective and only open up access for their specific workloads, maintaining a stronger security posture.

Using this cluster primary security group means that all traffic among the resources attached to the security group is permitted, and all egress to the public internet is allowed. This is the EKS "default behavior" when a security group isn't provided to nodes.
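
A minimal sketch of opting into that behavior with the v18 module; placing the flag under the node group defaults is an assumption for illustration, not necessarily how this PR sets it:

module "eks" {
  # ... cluster configuration ...

  eks_managed_node_group_defaults = {
    # Attach the cluster primary security group created by EKS so that all
    # attached resources can communicate, mirroring the EKS default behavior
    attach_cluster_primary_security_group = true
  }
}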

If you agree, this means we can remove the helm provider as well as the service account, role binding, metrics server, and dashboard resources from the PR entirely.

Absolutely. I wasn't sure of the intentions, so I just maintained the status quo, but I have now removed those.

Lastly, can you restore the terraform lock file?

Done! @alanszlosek

@alanszlosek alanszlosek changed the base branch from main to update-eks-module-18.0.4 July 20, 2022 14:45
@alanszlosek alanszlosek changed the base branch from update-eks-module-18.0.4 to update-eks-module-v18 July 20, 2022 14:46
alanszlosek (Contributor)

@bryantbiggs, that looks great. I'm going to merge this into a testing branch, update the tutorial text, and test it all end-to-end before release. Thanks for helping us make things better!

@alanszlosek alanszlosek merged commit 127b9fe into hashicorp:update-eks-module-v18 Jul 20, 2022
bryantbiggs (Contributor Author) commented Jul 20, 2022

Thank you @alanszlosek - let me know if I can be of any further assistance or if you have any questions on the module usage

Also, I think you can probably safely close out most, if not all, of the open issues/PRs with this. Not sure why it didn't automatically close them on this PR

Edit: Ah, nvm - I see it went into a temp branch and not main so they wouldn't be automatically closed on merge

@bryantbiggs bryantbiggs deleted the refactor/v18-upgrade branch July 20, 2022 15:00
alanszlosek (Contributor)

Thanks for your willingness, @bryantbiggs. I do have a question about the manage_aws_auth_configmap param. The EKS docs say this:

The aws-auth ConfigMap is automatically created and applied to your cluster when you create a managed node group or when you create a node group using eksctl.

Assuming I'm reading it correctly, it implies we shouldn't need Terraform to manage the ConfigMap, at least not for the tutorial. I'm hoping the mappings for each node group's IAM Role will be added automatically to the cluster when Terraform creates each group.

I know it's just one param, but if we don't need it, then I believe I can also remove the exec from the kubernetes provider. I'm on the lookout for additional ways to make the config simpler. :-)

Thanks for the heads-up on the Issues that can be closed. I'll make a note to close them.

bryantbiggs (Contributor Author)

Correct - with EKS managed node groups and Fargate profiles, those are automatically added to the configmap. If you want to remove that line, you can remove the Kubernetes provider block as well.
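
For reference, a hedged sketch of the attribute under discussion; the surrounding module configuration is elided:

module "eks" {
  # ... cluster configuration ...

  # Candidate for removal: EKS managed node groups get their IAM role mapped
  # into aws-auth automatically, so Terraform may not need to manage it here
  manage_aws_auth_configmap = true
}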
