Skip to content

[azure] support latest version for image id in azure cloud#4581

Merged
cg505 merged 4 commits intoskypilot-org:masterfrom
Yisaer:az_image
Jan 27, 2025
Merged

[azure] support latest version for image id in azure cloud#4581
cg505 merged 4 commits intoskypilot-org:masterfrom
Yisaer:az_image

Conversation

@Yisaer
Copy link
Copy Markdown
Contributor

@Yisaer Yisaer commented Jan 17, 2025

As titled.

close #4435

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)

I manually tested with following task:

resources:
  cloud: azure
  region: japaneast
  instance_type: Standard_D2s_v3
  image_id: Canonical:ubuntu-24_04-lts:server:latest

# Working directory (optional) containing the project codebase.
# Its contents are synced to ~/sky_workdir/ on the cluster.
workdir: .

# Typical use: pip install -r requirements.txt
# Invoked under the workdir (i.e., can use its files).
setup: |
  echo "Running setup."

# Typical use: make use of resources, such as running training.
# Invoked under the workdir (i.e., can use its files).
run: |
  echo "Hello, SkyPilot!"
  conda env list

The cluster was launched and the job work well. I checked the image in the azure console, and it was ubuntu-24_04-lts as expected

  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: conda deactivate; bash -i tests/backward_compatibility_tests.sh

Signed-off-by: Song Gao <disxiaofei@163.com>
Copy link
Copy Markdown
Collaborator

@cg505 cg505 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR, @Yisaer! This will be a useful feature. I appreciate you digging into this problem because the Azure docs are kind of unclear about it. Left a comment to check on.

Comment thread sky/clouds/azure.py Outdated
publisher_name=publisher,
offer=offer,
skus=sku)
latest_version = max(versions, key=lambda x: x.name)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you confirm that the name field here is the image version number? I assume it is a string, so will max() work correctly? E.g. we need to make sure that 1.0.10 > 1.0.2.

Copy link
Copy Markdown
Contributor Author

@Yisaer Yisaer Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out the max() problem, it should use packaging.version to parse the image version here.
Also I tried to find out the meaning of the name in the azure document, but it's not clear enough.
So when I tested my pr, I debug the returned versions during the runtime, and the returned name here was acutuall the image version number.

Signed-off-by: Song Gao <disxiaofei@163.com>
@Yisaer Yisaer requested a review from cg505 January 20, 2025 07:10
@cg505
Copy link
Copy Markdown
Collaborator

cg505 commented Jan 22, 2025

@Yisaer Just checking - we don't need this in make_deploy_resources_variables as well? Just for the image size check?

@Yisaer
Copy link
Copy Markdown
Contributor Author

Yisaer commented Jan 23, 2025

@Yisaer Just checking - we don't need this in make_deploy_resources_variables as well? Just for the image size check?

@cg505 Yes, it appears that compute_client.virtual_machine_images.get does not accept the latest image version, whereas compute_client.virtual_machines.begin_create_or_update does.

By querying the image size through a precise image version, while directly using the latest image version when creating a VM, there might be a difference in image information. However, the probability of this occurring is very small, and I think this is an acceptable solution.

@cg505 cg505 requested a review from Michaelvll January 23, 2025 04:02
Copy link
Copy Markdown
Collaborator

@cg505 cg505 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, it looks good to me! Let's have @Michaelvll review as well.

@Michaelvll
Copy link
Copy Markdown
Collaborator

@Yisaer Just checking - we don't need this in make_deploy_resources_variables as well? Just for the image size check?

@cg505 Yes, it appears that compute_client.virtual_machine_images.get does not accept the latest image version, whereas compute_client.virtual_machines.begin_create_or_update does.

By querying the image size through a precise image version, while directly using the latest image version when creating a VM, there might be a difference in image information. However, the probability of this occurring is very small, and I think this is an acceptable solution.

Thanks for the explanation! We should mention this in the comment as well.

Signed-off-by: Song Gao <disxiaofei@163.com>
@Yisaer Yisaer requested a review from cg505 January 25, 2025 11:31
Comment thread hello_sky.yaml Outdated
Signed-off-by: Song Gao <disxiaofei@163.com>
@Yisaer Yisaer requested a review from cg505 January 26, 2025 08:07
Copy link
Copy Markdown
Collaborator

@cg505 cg505 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!

@cg505 cg505 merged commit 62228ce into skypilot-org:master Jan 27, 2025
@Yisaer Yisaer deleted the az_image branch July 20, 2025 02:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Azure image-id from marketplace with :latest fails

3 participants