Use D: for Windows installations on Azure #2076
Conversation
Yes, this cuts the Miniforge installation time in half. Just installing Miniforge (not counting the mamba updates we do on top later) takes 1m10s–1m30s (seeing times for this run of the pyside2 feedstock). Changing `C:` to `D:` means we can do it in 30s. The install times for the base environment are 1m40s on `C:` and 54s on `D:`.
Can we also run builds and all on `D:`?
That's already the case :D See …
For context, we used … Not sure if switching everything to `D:` is safe.
Hm, I feared as much. Maybe this should be configurable. I doubt most feedstocks are running out of space, but for sure all of them are installing the same base environment. We can shave a few minutes off those with `D:` + micromamba. As long as this is documented next to the "free_disk_space" options, I think we should be fine, wdyt? If that's agreeable, then I'll put up some settings.
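A hedged sketch of what the opt-in could look like in `conda-forge.yml` — the key names and paths below are illustrative, following existing conda-smithy conventions (per-provider `settings_win` variables), not a final schema:

```yaml
# Illustrative only: the exact keys and default paths may differ
# from what conda-smithy ends up supporting.
azure:
  settings_win:
    variables:
      MINIFORGE_HOME: D:\Miniforge3   # where the base env is installed
      CONDA_BLD_PATH: D:\bld\         # keep build dirs on the same drive
```

Keeping both on the same drive matters for the hardlinking concern discussed below.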
I'm also thinking, having the cache in C: and BLD in D:... wouldn't that cause copies across drives? I don't think Windows can hardlink across drives 🤔 That would also cause slowdowns and diminish the benefits of saving space. Maybe it makes sense in some extreme cases with multi-outputs only, not sure. |
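Right — hardlinks only work within a single volume, so a `C:` cache with a `D:` build path forces copies. A minimal Python sketch (illustrative, not conda's actual logic) of how one can check whether two paths share a filesystem:

```python
import os

def same_filesystem(path_a: str, path_b: str) -> bool:
    """Two paths can be hardlinked only if they live on the same
    device/volume, i.e. their stat() device IDs match."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

On Windows, `C:\` and `D:\` are distinct volumes, so `os.link` across them fails and tools silently fall back to copying, which is exactly the slowdown described above.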
Yeah, that makes sense. The only issue here would be that feedstocks that customized CONDA_BLD_PATH to `C:` now have to customize MINIFORGE_HOME too. Maybe we do that in a migrator first?
There are only 12 of them, according to the search. Could be faster to just send them manually, and I'm happy to do so :D |
Thanks. Note that those feedstocks are the heaviest loads, so keep an eye on the CI if you can. |
If only Azure respected the …
I think PowerShell has a non-negligible start time, right? We can compare against the …
I don't think so. The end-to-end time of the PowerShell download was pretty fast. In any event, we are currently calling Python to do that download, which has its own startup time. Also not sure what kind of security measures are applied on Azure; depending on what they are, these can add overhead to startup, runtime, downloads, etc. If we are especially concerned with startup time, we could use Batch for downloading. The syntax is a little unpleasant, though hopefully it is write-once. If the goal here is just generally to improve CI startup time, it might be worth revisiting Windows Docker images (conda-forge/docker-images#209). With that we could do all of the download, configuration, etc. in the image. Then CI need only download the image and perhaps update before building. If Azure has a particular registry it likes, we could also push to that registry for faster downloading (and maybe opt in to earlier scanning to avoid this cost in CI jobs).
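Interpreter startup overhead is easy to measure directly rather than guess at; a minimal sketch (the commands in the comments are illustrative of what one might compare on a Windows agent):

```python
import subprocess
import time

def best_startup_time(cmd: list[str], runs: int = 3) -> float:
    """Best-of-N wall-clock seconds to launch `cmd` and have it exit.
    Best-of-N dampens one-off noise from cold caches or CI jitter."""
    best = float("inf")
    for _ in range(runs):
        t0 = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        best = min(best, time.perf_counter() - t0)
    return best

# On Windows one might compare, for example:
#   best_startup_time(["powershell", "-NoProfile", "-Command", "exit"])
#   best_startup_time(["python", "-c", "pass"])
#   best_startup_time(["cmd", "/c", "exit"])
```

Whichever launcher wins here still only saves startup time once per job, so the Docker-image idea above likely dominates if the goal is overall CI startup.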
Yea, using …
That looks nice! I would be happy with that 😄 |
Shall we merge @conda-forge/core? |
Honestly am a little nervous about the disk space usage discussion above. Mentioning this as we are running into low-memory warnings on Windows more frequently (for example: conda-forge/numpy-feedstock#334). So I would like to better understand how changing the default drive will affect this (given a swapfile may be in use).
@jakirkham in your estimation, is numpy a typical build or more on the extreme end of things? I was hoping that for more pedestrian builds we'd be just fine, and that the fallout would only be for a few of the very large builds.
My impression is NumPy is pretty low in resource usage (though I could always be missing something, like a problematic test). Certainly higher resource usage feedstocks are seeing this warning too.
How about we add a knowledge base entry on how to change this when you encounter these warnings? |
The docs will update from the JSON schema, but we could add more useful text for sure. I am surprised we consider numpy "low" on resource usage when building: the numpy Windows builds run for ~25 minutes and include 5 minutes of testing. That seems high to me compared to a more typical 5- or 10-minute build.
For a compiled build, yes, I would say the lower end. NumPy is C only (a little Cython) with a pretty simple test suite, and no C++. Personally I would consider high to be something like PyTorch, where we just need a different CI provider altogether. We have quite a few things like …. Perhaps you can share your definition?
I don't have one and so am happy with whatever. I am surprised but I guess this is how it is. So should we merge or wait? |
Worst thing that can happen is that folks will experience disk issues, we'll hear about it in the Element room, and we will flip the default so we keep installing to `C:`.
The disk space situation is independent from potential memory shortage (well, except in the most extreme builds where we manually set up a swapfile to use some diskspace to boost too-low memory and end up nearly running out of both, but that's neither common, nor the case of numpy). |
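Since the disk and memory concerns are separate, the disk side is at least cheap to monitor; a small Python sketch (the `D:` path and threshold are illustrative) of the kind of early check a build script could do:

```python
import shutil

def free_gib(path: str = ".") -> float:
    """Free space on the volume containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30

# A build script could warn early, e.g. (drive letter and
# threshold are illustrative, not conda-forge's actual values):
#   if free_gib("D:\\") < 10.0:
#       print("warning: low free space on D:")
```

This only flags disk pressure; a swapfile-backed memory shortage would still need separate monitoring.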
There are some docs in the schema already, and those will show up in this page (although that's a bit hidden under …). We can additionally set up a KB entry if we feel that's necessary. I'm inclined to do so when we start working on this new suggestion for Dev Drives. I took care of the 12 feedstocks.
To whoever merges this, consider checking / merging the equivalent PR for staged-recipes: conda-forge/staged-recipes#27767 🙏
I think we have enough consensus to merge. Once we release if we are seeing a lot of issues we can make adjustments. |
Checklist
- Added a `news` entry
- `python conda_smithy/schema.py`

The `D:` drive is supposed to be faster, so let's see if that's true!