feat: ListNodeImageVersions + shared image gallery support #526

Merged
Bryce-Soghigian merged 48 commits into main from bsoghigian/list-node-image-versions-poc on Dec 21, 2024

Conversation

@Bryce-Soghigian Bryce-Soghigian commented Oct 16, 2024

Fixes #

Description
This PR introduces:

  1. A USE_SIG option that controls whether shared image galleries (SIG) are used.
  2. Shared image gallery ID resolution.
  3. New ListNodeImageVersions logic that fetches the latest approved AgentBaker node image version instead of treating the galleries as the source of truth.
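
As a rough illustration of how the three pieces could fit together, here is a minimal sketch, not the code in this PR. The type names, field names, and the `ResolveImageID` helper are hypothetical, and it assumes the node image versions listing yields one approved version per image definition. The two resource ID layouts are the standard Azure formats for shared and community gallery image versions.

```go
package main

import "fmt"

// NodeImageVersion is a hypothetical, simplified record of one approved
// node image as a versions listing might report it.
type NodeImageVersion struct {
	SKU     string // image definition name (illustrative)
	Version string // approved image version (illustrative)
}

// GalleryConfig holds hypothetical gallery coordinates. UseSIG mirrors the
// USE_SIG option described above; the remaining fields are assumptions.
type GalleryConfig struct {
	UseSIG            bool
	SIGSubscriptionID string
	ResourceGroup     string
	GalleryName       string
	PublicGalleryName string
}

// ResolveImageID looks up the approved version for imageDefinition and
// builds either a shared image gallery ID or a community gallery ID.
func ResolveImageID(cfg GalleryConfig, imageDefinition string, versions []NodeImageVersion) (string, error) {
	for _, v := range versions {
		if v.SKU != imageDefinition {
			continue
		}
		if cfg.UseSIG {
			// Shared image gallery: a subscription-scoped ARM resource ID.
			return fmt.Sprintf(
				"/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/galleries/%s/images/%s/versions/%s",
				cfg.SIGSubscriptionID, cfg.ResourceGroup, cfg.GalleryName, imageDefinition, v.Version,
			), nil
		}
		// Community gallery: addressed by its public gallery name.
		return fmt.Sprintf(
			"/CommunityGalleries/%s/images/%s/versions/%s",
			cfg.PublicGalleryName, imageDefinition, v.Version,
		), nil
	}
	return "", fmt.Errorf("no approved version found for image %q", imageDefinition)
}
```

In this sketch the version comes from the listing in both modes, and the gallery only determines how the ID is spelled. That is one reading of "rather than using galleries as a source of truth".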

Open questions:
Are we OK with all nodes that use Community Image Gallery node images drifting over to SIG in managed Karpenter with this change?
Should we also implement the Windows-side changes for the NodeImageVersions API?

How was this change tested?

  • manual testing on MSFT Tenant SIG Galleries for Ubuntu and Azure Linux
  • make az-all
  • make test

Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: #
  • No

Release Note


@coveralls
coveralls commented Oct 16, 2024

Pull Request Test Coverage Report for Build 12442803767

Details

  • 147 of 202 (72.77%) changed or added relevant lines in 13 files are covered.
  • 5 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-0.06%) to 95.42%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pkg/providers/imagefamily/types.go | 8 | 12 | 66.67% |
| pkg/providers/imagefamily/image.go | 41 | 47 | 87.23% |
| pkg/providers/instance/azure_client.go | 1 | 7 | 14.29% |
| pkg/providers/imagefamily/nodeimageversionsclient.go | 36 | 75 | 48.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| pkg/cloudprovider/drift.go | 1 | 57.14% |
| pkg/providers/instance/azure_client.go | 1 | 15.0% |
| pkg/providers/imagefamily/image.go | 1 | 83.15% |
| pkg/fake/pricingapi.go | 2 | 92.31% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 12438947198 | -0.06% |
| Covered Lines | 48132 |
| Relevant Lines | 50442 |

💛 - Coveralls

@Bryce-Soghigian Bryce-Soghigian marked this pull request as ready for review October 16, 2024 06:35

@tallaxes tallaxes left a comment

Please resolve the merge conflicts (which should include removing the AKSNodeClass imageVersion).

pkg/providers/imagefamily/image.go (outdated, resolved)
pkg/providers/imagefamily/image.go (outdated, resolved)
pkg/providers/imagefamily/image.go (outdated, resolved)
pkg/providers/imagefamily/image.go (outdated, resolved)
pkg/providers/instance/azure_client.go (resolved)
pkg/operator/options/options.go (outdated, resolved)
pkg/providers/imagefamily/image.go (outdated, resolved)

@Copilot Copilot AI left a comment

Copilot reviewed 5 out of 19 changed files in this pull request and generated 1 suggestion.

Files not reviewed (14)
  • Makefile-az.mk: Language not supported
  • pkg/providers/imagefamily/image_test.go: Evaluated as low risk
  • pkg/cloudprovider/drift.go: Evaluated as low risk
  • pkg/test/options.go: Evaluated as low risk
  • pkg/fake/nodeimageversionsapi.go: Evaluated as low risk
  • pkg/providers/instancetype/suite_test.go: Evaluated as low risk
  • pkg/providers/instance/instance.go: Evaluated as low risk
  • pkg/providers/instance/azure_client.go: Evaluated as low risk
  • pkg/test/environment.go: Evaluated as low risk
  • karpenter-values-template.yaml: Evaluated as low risk
  • pkg/providers/imagefamily/azlinux.go: Evaluated as low risk
  • pkg/operator/operator.go: Evaluated as low risk
  • pkg/providers/imagefamily/ubuntu_2204.go: Evaluated as low risk
  • pkg/providers/imagefamily/types.go: Evaluated as low risk
Comments skipped due to low confidence (1)

pkg/providers/imagefamily/image.go:132

  • Ensure that the new string splitting approach correctly handles all possible cases by adding test cases.
imageID, ok := p.imageCache.Get(key)

@tallaxes tallaxes left a comment

Everything looks good and ready to go (caching may be further simplified, I think, but that's minor), but E2E tests are failing:

https://github.com/Azure/karpenter-provider-azure/actions/runs/12309122786/job/34355672677

Error: variable ${SIG_SUBSCRIPTION_ID} not set
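
A failure like this can be caught earlier by validating the SIG settings together at startup. The following is a minimal sketch under assumptions, not the PR's options code: it treats USE_SIG and SIG_SUBSCRIPTION_ID as environment variables, and the `SIGOptions` type and `LoadSIGOptions` helper are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// SIGOptions is a hypothetical container for the SIG-related settings.
type SIGOptions struct {
	UseSIG         bool
	SubscriptionID string
}

// LoadSIGOptions reads USE_SIG and, only when it is true, requires
// SIG_SUBSCRIPTION_ID. The lookup parameter has the same signature as
// os.LookupEnv, so tests can substitute a map-backed function.
func LoadSIGOptions(lookup func(string) (string, bool)) (SIGOptions, error) {
	var opts SIGOptions

	// An unset or empty USE_SIG is treated as false (an assumption).
	if raw, ok := lookup("USE_SIG"); ok && raw != "" {
		parsed, err := strconv.ParseBool(raw)
		if err != nil {
			return opts, fmt.Errorf("invalid USE_SIG value %q: %w", raw, err)
		}
		opts.UseSIG = parsed
	}
	if !opts.UseSIG {
		return opts, nil
	}

	// The subscription ID is only mandatory when SIG is enabled.
	sub, ok := lookup("SIG_SUBSCRIPTION_ID")
	if !ok || sub == "" {
		return opts, errors.New("SIG_SUBSCRIPTION_ID must be set when USE_SIG is true")
	}
	opts.SubscriptionID = sub
	return opts, nil
}
```

Calling `LoadSIGOptions(os.LookupEnv)` would read the real environment. Making the subscription ID conditional on USE_SIG keeps runs that do not use SIG from failing on an unset variable.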

…Azure/karpenter-provider-azure into bsoghigian/list-node-image-versions-poc
@tallaxes

@Bryce-Soghigian please link to evidence of E2E tests passing

@Bryce-Soghigian

@tallaxes

The utilization and ACR tests are selecting v6 SKUs, and as we have discussed, that functionality is currently broken.
See: https://github.com/Azure/karpenter-provider-azure/actions/runs/12361732672/job/34731291609#step:18:130, https://github.com/Azure/karpenter-provider-azure/actions/runs/12361732672/job/34731292840#step:18:318
---

I am updating the branch to include your fix for that bug and will rerun the E2E tests.

tallaxes previously approved these changes Dec 21, 2024

@tallaxes tallaxes left a comment

🎉 E2E: https://github.com/Azure/karpenter-provider-azure/actions/runs/12439573149

(Please follow up with including NVMe-only VM SKUs when Shared Image Gallery is used.)

@tallaxes tallaxes left a comment

Please follow up with #617 (comment) (could be separate PR)

@Bryce-Soghigian Bryce-Soghigian merged commit 3a9cbc0 into main Dec 21, 2024
11 of 12 checks passed
@Bryce-Soghigian Bryce-Soghigian deleted the bsoghigian/list-node-image-versions-poc branch December 21, 2024 12:18
Labels
area/nap Issues or PRs related to Node Auto Provisioning (NAP) area/vm-images Issues or PRs related to VM images or image galleries
5 participants