
Use D: for Windows installations on Azure #2076

Merged · 12 commits · Oct 15, 2024

Conversation

@jaimergp (Member) commented Oct 3, 2024

Checklist

  • Added a news entry
  • Regenerated schema JSON if schema altered (python conda_smithy/schema.py)

The D: drive is supposed to be faster, so let's see if that's true!

@jaimergp (Member, Author) commented Oct 3, 2024

Yes, this cuts the Miniforge installation time in half. Just installing Miniforge (not counting the mamba updates we do on top later) takes 1m10s-1m30s (see the times for this run of the pyside2 feedstock). Changing C: to D: brings that down to about 30s.

The install times for base are:

For C, 1m40s:

2024-10-03T14:17:59.8193862Z Installing dependencies
2024-10-03T14:19:40.0923480Z Setting up configuration

For D, 54s:

2024-10-03T15:02:08.0620513Z Installing dependencies
2024-10-03T15:03:02.2620718Z Setting up configuration

So, in total:

  • C: takes 1m17s (Miniforge install) + 1m40s (deps) = 2m57s
  • D: takes 37s (Miniforge install) + 54s (deps) = 1m31s

@wolfv (Member) commented Oct 3, 2024

Can we also run the builds and everything else on D:? E.g., set CONDA_BLD_PATH=D:\bld?
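For reference, conda-forge.yml can pass per-provider job variables, so an override of this kind could be sketched like the following (assuming the `azure.settings_win.variables` key that conda-smithy supports; the exact layout should be checked against the current schema):

```yaml
# conda-forge.yml (sketch): point conda-build's work area at the faster D: drive
azure:
  settings_win:
    variables:
      CONDA_BLD_PATH: D:\bld\
```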

@isuruf (Member) commented Oct 3, 2024

For context, we used C: for Miniforge so that pkgs_dirs ends up on C: (which has more space), and D: for the conda-build path because D: is faster. We also have a switch to build on C: for feedstocks that run out of space on D:.

I'm not sure if switching everything to D: would make builds run out of space for more feedstocks.

@jaimergp (Member, Author) commented Oct 3, 2024

Hm, I feared as much. Maybe this should be configurable. I doubt most feedstocks are running out of space, but all of them are certainly installing the same base environment, and we can shave a few minutes off that with D: + micromamba.

As long as this is documented next to the "free_disk_space" options, I think we should be fine, wdyt? If that's agreeable, I'll add some settings.

@jaimergp (Member, Author) commented Oct 3, 2024

I'm also thinking: with the cache on C: and CONDA_BLD_PATH on D:, wouldn't that cause copies across drives? I don't think Windows can hardlink across drives 🤔 That would also cause slowdowns and diminish the benefit of saving space. Maybe it makes sense only in some extreme multi-output cases, not sure.
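The cross-drive concern is easy to check programmatically: hard links only work within a single volume, and `os.link` raises `OSError` (EXDEV) across devices. A minimal sketch (not from the PR) that predicts link feasibility by comparing device IDs:

```python
import os
import tempfile

def can_hardlink(src_dir: str, dst_dir: str) -> bool:
    """Heuristic: hard links require both paths to live on the same device/volume."""
    return os.stat(src_dir).st_dev == os.stat(dst_dir).st_dev

# Demo within a single temp directory (same volume, so linking succeeds):
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "pkg_file")
    dst = os.path.join(d, "bld_file")
    with open(src, "w") as f:
        f.write("cached package payload")
    assert can_hardlink(d, d)
    os.link(src, dst)  # across drives (e.g. C: cache, D: build) this raises OSError
    # Both names now point at the same inode; no data was copied.
    assert os.stat(src).st_ino == os.stat(dst).st_ino
```

When the cache and build prefix sit on different drives, package extraction falls back to copying, which is exactly the slowdown described above.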

@isuruf (Member) commented Oct 3, 2024

Yeah, that makes sense. The only issue here is that feedstocks that customized CONDA_BLD_PATH to C: now have to customize MINIFORGE_HOME too. Maybe we do that in a migrator first?

@jaimergp (Member, Author) commented Oct 3, 2024

There are only 12 of them, according to the search. It could be faster to just send those PRs manually, and I'm happy to do so :D

@isuruf (Member) commented Oct 3, 2024

Thanks. Note that those feedstocks are the heaviest loads, so keep an eye on the CI if you can.

@jaimergp (Member, Author) commented Oct 3, 2024

If only Azure respected the [skip ci] stuff :( We can cancel the runs on Azure, but I don't have admin permissions there, so if you see it's a problem, please feel free to cancel the heaviest ones as you see fit! Thanks and sorry!

@jakirkham (Member)

FWIW, PR #1966 switched to using PowerShell to download Miniforge, which helped a little. Though the main push of that PR was to free more space on the image (as space is a constant challenge). Also please see issue #1949 about the space limitations.

@jaimergp (Member, Author) commented Oct 3, 2024

> FWIW, PR #1966 switched to using PowerShell to download Miniforge, which helped a little.

I think PowerShell has a non-negligible startup time, right? We can compare against the certutil hack, though. The important bit is to have it in the .bat script; the download tool itself is not as important.

@jakirkham (Member) commented Oct 3, 2024

I don't think so. The end-to-end time of the PowerShell download was pretty fast.

In any event, we are currently calling Python to do that download, which has its own startup time.
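For comparison, a Python-based download of the kind discussed here is roughly of this shape (a hedged sketch using only the standard library, not the actual conda-smithy code):

```python
import urllib.request

def download(url: str, dest: str) -> str:
    """Fetch `url` to `dest`; urlretrieve handles the copy loop for us."""
    path, _headers = urllib.request.urlretrieve(url, dest)
    return path
```

Note that when this is invoked as `python -c ...` from a .bat script, the interpreter startup cost is paid on top of the transfer time itself, which is the overhead being weighed against PowerShell or certutil.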

Also not sure what kind of security measures are applied on Azure. Depending on what they are, these can add overhead to startup, runtime, downloads, etc.

If we are especially concerned with startup time, we could use Batch for downloading. The syntax is a little unpleasant, though hopefully it is write-once.

If the goal here is just generally to improve CI startup time, it might be worth revisiting Windows Docker images ( conda-forge/docker-images#209 ). With that we could do all of the download, configuration, etc. in the image. Then CI need only download the image and perhaps update before building. If Azure has a particular registry it likes, we could also push to that registry for faster downloading (and maybe opt-in to earlier scanning to avoid this cost in CI jobs)

@jaimergp (Member, Author) commented Oct 4, 2024

> I don't think so. The end-to-end time of the PowerShell download was pretty fast.
>
> In any event, we are currently calling Python to do that download, which has its own startup time.
>
> Also not sure what kind of security measures are applied on Azure. Depending on what they are, these can add overhead to startup, runtime, downloads, etc.
>
> If we are especially concerned with startup time, we could use Batch for downloading. The syntax is a little unpleasant, though hopefully it is write-once.

Yea, using certutil is not that bad: https://github.com/conda-forge/conda-smithy/pull/2075/files#diff-69e88e5a41c617697582fa7d0b6a32d712995f682e7cac24b517d4b8cc4ff9b8R28
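For reference, the certutil approach generally looks like this (a generic sketch of the technique, not the exact line from the linked diff; certutil ships with Windows, so no extra runtime has to start up):

```bat
:: Download the Miniforge installer with certutil (built into Windows)
certutil -urlcache -split -f "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe" "%TEMP%\Miniforge3.exe"
```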

@jakirkham (Member)

> Yea, using certutil is not that bad: https://github.com/conda-forge/conda-smithy/pull/2075/files#diff-69e88e5a41c617697582fa7d0b6a32d712995f682e7cac24b517d4b8cc4ff9b8R28

That looks nice! I would be happy with that 😄

@beckermr (Member)

Shall we merge, @conda-forge/core?

@jakirkham (Member)

Honestly, I am a little nervous about the disk space usage discussion above. I mention this as we are running into low-memory warnings on Windows more frequently (for example: conda-forge/numpy-feedstock#334). So I would like to better understand how changing the default drive will affect this (given a swapfile may be in use).

@beckermr (Member) commented Oct 14, 2024

@jakirkham, in your estimation, is numpy a typical build or more on the extreme end of things?

I was hoping that more pedestrian builds would be just fine, and that the fallout would only hit a few of the very large builds.

@jakirkham (Member)

My impression is that NumPy is pretty low in resource usage (though I could always be missing something, like a problematic test).

Certainly, higher-resource-usage feedstocks are seeing this warning too.

@wolfv (Member) commented Oct 14, 2024

How about we add a knowledge base entry on how to change this when you encounter these warnings?

@beckermr (Member)

The docs will update from the JSON schema, but we could add more useful text for sure.

I am surprised we consider numpy "low" on resource usage when building. The numpy Windows builds run for ~25 minutes and include 5 minutes of testing. That seems high to me compared to a more typical 5-10 minute build.

@jakirkham (Member)

For a compiled build, yes, I would say it is on the lower end. NumPy is C only (plus a little Cython) with a pretty simple test suite. No C++.

Personally, I would consider "high" to be something like PyTorch, where we just need a different CI provider altogether. We have quite a few things like that.

Perhaps you can share your definition?

@beckermr (Member)

I don't have one, so I am happy with whatever. I am surprised, but I guess this is how it is. So should we merge or wait?

@jaimergp (Member, Author)

The worst thing that can happen is that folks experience disk issues, we hear about it in the Element room, and we flip the default back so we keep installing to C:, even if it is slower. But that's hypothetical, and maybe it's not that bad. Low memory is not necessarily low disk, even with swapfiles considered. I'd say let's merge and see?

@h-vetinari (Member)

> Honestly am a little nervous about the disk space usage discussion above. Mention this as we are running into low memory warnings on Windows more frequently

The disk space situation is independent from potential memory shortage (well, except in the most extreme builds where we manually set up a swapfile to use some diskspace to boost too-low memory and end up nearly running out of both, but that's neither common, nor the case of numpy).

@jaimergp (Member, Author) commented Oct 15, 2024

There are some docs in the schema already, and those will show up on this page (although a bit hidden under the AzureConfig foldable), so search engines will pick it up at some point.

We can additionally set up a KB entry if we feel that's necessary. I'm inclined to do so when we start working on the new suggestion for Dev Drives.

I took care of the 12 C:\Miniconda PRs above, so this is now ready for merge, which will unblock the Micromamba PR.

@jaimergp (Member, Author)

To whoever merges this: please consider checking/merging the equivalent PR for staged-recipes: conda-forge/staged-recipes#27767 🙏

@beckermr (Member)

I think we have enough consensus to merge. Once we release, we can make adjustments if we see a lot of issues.

@beckermr merged commit 14e84b7 into conda-forge:main on Oct 15, 2024 (2 checks passed)