Pin Python 3.9.16 on Cygwin CI #1814
This uses Python 3.8.16 (provided by the Cygwin package python38 at version 3.8.16-1), to work around the problem that pip has begun to block on some PyPI package downloads when Python 3.9.18 (provided by the Cygwin package python39 at version 3.9.18-1) is used.

I also tried a bunch of other stuff, which is listed below and can be examined in full detail, with all individual diffs and most CI results, at #2.

* Try not installing/upgrading wheel for Cygwin CI

  This is for a recent problem where "pip install -U" in the virtual environment in a Cygwin test job seems to block indefinitely on downloading the wheel package itself (not other packages' wheels).

* Try not upgrading/installing pip/setuptools either on Cygwin

* Try installing python39-wheel Cygwin package

  Maybe this will overcome the next blockage, which is the codecov PyPI package, downloading a .tar.gz file.

* Try upgrading wheel, but after upgrading pip

* Try always running pip on Cygwin as "python -m pip"

* Try using a venv on Cygwin

* Use "python -v -m pip" to see some of what's going on

* Undo venv; use "python -m pip -vvv" to see what's going on

* Undo all debugging changes except passing "-vvv"

* Try with "--no-cache-dir"

* Try with different tmp dir for pip runs

* Try with python39=3.9.16-1

* Try not upgrading setuptools

* Try not installing Cygwin python39-pip package

* Run pip freeze effectively

  This doesn't fix the bigger issue, it just addresses something from the last commit.

* Try not installing python39-virtualenv either

* Try giving IPv4 for files.pythonhosted.org in hosts file

* Try downloading wheel with wget

  This is not a usable solution, but it is useful for troubleshooting.

* Try with python39-pip=23.0.1-1

  And don't upgrade it or other PyPI packages.

* Pin pip with pip (Cygwin package doesn't pin)

  This tries with an older pip, but if the problem is the build rather than the version, then it would also help.

* Stop pinning; keep skipping -U for PyPA; instrument with -vvv

  This won't fix it but is diagnostic, to reveal the URL for the coverage package, so I can see what happens when that is installed more manually.

* Try installing coverage[toml] separately

* Unset -vvv to see the bigger picture more easily

* Try killing pip after a timeout and rerunning it

* Use SIGKILL

* Increase timeout from 70 to 120 seconds per try

* Give each try a little more time than the last

  Since it has to verify previous work.

* Tweak (re)try parameters

* Try Python 3.8
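The timeout-and-retry steps near the end of the list above (kill pip after a timeout, rerun it, and give each try a little more time than the last) can be sketched in Python roughly as follows. This is an illustration only, not the CI script from those commits: the name `run_with_retries`, its parameters, and the 30-second increment are hypothetical. Only the 120-second starting budget comes from the list.

```python
import subprocess
import sys


def run_with_retries(cmd, first_timeout=120.0, extra_per_try=30.0, max_tries=5):
    """Run cmd, killing and rerunning it whenever it exceeds its time budget.

    Returns the 1-based number of the try that exited with status 0,
    or None if no try succeeded.
    """
    for attempt in range(max_tries):
        # Later tries get more time, since they have to verify previous work.
        timeout = first_timeout + attempt * extra_per_try
        try:
            # On timeout, subprocess.run kills the child before raising.
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            print(f"try {attempt + 1} exceeded {timeout:.0f}s; rerunning",
                  file=sys.stderr)
            continue
        if result.returncode == 0:
            return attempt + 1
        # A nonzero exit status also falls through to the next try.
    return None
```

A call such as `run_with_retries([sys.executable, "-m", "pip", "install", "-e", ".[test]"])` would rerun the install until one try finishes. Whether the kill that `subprocess.run` performs on timeout reliably ends a spinning Cygwin process is a separate question that this sketch does not settle.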
The latest currently packaged version of Python 3.9 for Cygwin is 3.9.18 (provided by the Cygwin package python39 at version 3.9.18-1). That version, at least as we are using it, has a problem where pip has begun to block on some PyPI package downloads.

In 73ebcfa (#2), I worked around this problem by downgrading the minor version of Python to 3.8. But it is better to use 3.9 if we can, since it is currently the latest minor version of Python in the Cygwin repositories, and also because (relating to that) it is used more often, and thus probably used more often with GitPython, than 3.8.

This upgrades Python on Cygwin but not all the way. It upgrades it to the latest (or latest currently available) patch version of 3.9 packaged for Cygwin of those that strictly precede 3.9.18 where the problem occurs. That version is 3.9.16, provided by the Cygwin package python39 at version 3.9.16-1.

This version may eventually no longer be available for download from Cygwin's repositories, so hopefully a real solution or better workaround will be found by then, or perhaps a future update to the package itself will fix the problem.

I also tried some more things since finding 3.8 to work in 73ebcfa. Changes since then are listed below. They can be examined in full detail, with individual diffs and CI results, at #3.

* Try Python 3.9 with other details the same

  Changing it to Python 3.8 worked, but I want to check that it was actually the use of Python 3.8, rather than other seemingly small changes made to support using Python 3.8, that made the difference.

* Revert "Try Python 3.9 with other details the same"

  This reverts commit b55cbfb.

* Try 3.9 again, with both python39=3.9.16-1 python39-pip=23.0.1-1

* Back to 3.8; try another GitHub Action

  Python 3.8 worked with cygwin-install-action, but I want to make the change to setup-cygwin by itself first before trying it with 3.9, in case I am using setup-cygwin incorrectly.

* Try 3.9 with this setup-cygwin action

* Try pinning with setup-cygwin

* Try not pinning, but no -U for PyPA, with setup-cygwin

  Pinning and skipping -U for PyPA packages worked. Let's see if it was really pinning that made the difference.

* Try pinning just python39-pip

  Pinning works, and merely omitting the -U for PyPA packages doesn't. Examining the output of runs that used cygwin-install-action and attempted pinning Cygwin package versions shows newer versions were installed, whereas pinning is really happening with setup-cygwin. This tries pinning just the Cygwin package for pip, rather than for Python 3.9. I don't expect this to work.

* Try pinning just python39=3.9.16-1

  And not pip, but this does not add back the -U for PyPA yet.

* Add back -U for PyPA packages

* Try pinning python39=3.9.16-1 with old action/everything

  This is extremely unlikely to work, I just want to check.

* Try just setup-cygwin and pinning python39=3.9.16-1

  That is, this puts back all the other stuff the way it was on the main branch when the breakage occurred, besides changing from cygwin-install-action to setup-cygwin to make pinning work and using it to get version 3.9.16-1 of the Cygwin python39 package.
Thanks a lot for the investigation, and for coming up with a fix after what seems like arduous work with more than 40 attempts. What surprises me the most is that I didn't see a link to an official tracking issue for this bug in what might be the CPython repository - after all, a lot of people should have encountered it (but apparently didn't). Definitely still puzzling.
I think the bug is probably in the downstream Cygwin package for Python 3.9.18, or in some other component of Cygwin but only triggered due to changes in 3.9.18. But I haven't found anything about it on the Cygwin mailing lists. I might investigate it further. Now that I can produce it locally, I could try to come up with a minimal way to produce it, then observe that with Cygwin's |
That's great to hear! I have a feeling that this bug is great for denial of service attacks against toolchains given that it spins hot forever and ignores all non-fatal signals as well. If you can track it down that might be preventing further damage to the wider internet - part of me probably can't believe that GitPython is so special it's the only project experiencing the issue. |
It turns out that the more specific trigger in

Contrary to what my experiments described above had led me to think, this does actually depend to some extent on the version of pip that is used. If an old enough version is used, it draws progress bars in a different way that is not affected. In the past, pip vendored progress and used it to draw progress bars; that vendored dependency was removed in pip 22.1. Recent versions of pip use a feature of their vendored rich dependency to draw progress bars.

Because a version of pip with a differently implemented progress bar doesn't stall, I suspect the problem may really be triggered by the progress bar drawing code. However, the bug still appears to be in Python 3.9.18 as packaged for Cygwin, and not in rich or pip, unless the other indefinite stalling problem, detailed below, that happens after this one is overcome, turns out, somehow, to be altogether separate.

Details

I experimented with turning off the progress bars in 6f57fe5. In that commit's Cygwin run, it gets past

I have not managed to produce that locally because I get a different effect. Locally, the malfunction is at the same point, on the way into the decorator wrapper for

This happens when the decorator wrapper sets up a remote repository. Surprisingly, the problem seems unrelated to the wrapper's use of

```sh
python -c 'import time, git; print(git.Repo().remote().fetch()); time.sleep(0.12); print("OK.")'
```

When I produce the problem interactively in a REPL or

The problem might be triggered by something in

It appears NumPy's CI is also affected

My earlier web searches had not turned up other projects that had a similar problem to what this PR works around. But more recently, a search revealed numpy/numpy#25708, which seems to be the same problem, in which Python stalls on Cygwin since the upgrade from the downstream 3.9.16 to the downstream 3.9.18.
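For anyone who wants to repeat the progress-bar experiment mentioned in this comment, one way to ask pip not to draw progress bars is its `--progress-bar off` option. The helper below is a minimal sketch. The name `pip_install_command` is hypothetical, and this is not necessarily how the experiment in 6f57fe5 was done.

```python
import sys


def pip_install_command(requirements, progress_bar=False):
    """Build an argument list for `python -m pip install`.

    With progress_bar=False, pip is asked not to draw progress bars.
    """
    cmd = [sys.executable, "-m", "pip", "install"]
    if not progress_bar:
        # --progress-bar is a pip option; the value "off" suppresses the bar.
        cmd += ["--progress-bar", "off"]
    return cmd + list(requirements)
```

The resulting list can be passed to `subprocess.run`, for example `subprocess.run(pip_install_command(["-e", ".[test]"]), check=True)`. Running pip as `python -m pip` here mirrors one of the variations tried in the commits, which reportedly made no difference to the stall.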
Thanks so much for sharing this early result, a very interesting read! Can

While reading, I thought "threads, pipes, subprocess spawning, lots of syscalls, forks, locks" as things that probably happen concurrently here. And the fact that it's not the whole world suffering from this probably means it is something rather specific at the intersection of numpy and GitPython (and who knows which else). The silent exit 0 that could be produced locally makes me think that it can't be the work of a signal, but would be typical for what happens once a forked process exits successfully. It's like the memory of the parent of the fork gets messed with, causing it to exit instead. This shouldn't be possible with a normal

Anyway, that's all I've got 😁.
This has been found to affect Pillow's CI (as well as GitPython's and NumPy's): It has also been reported to the Cygwin mailing list (Cygwin does not use a separate bug tracker):
Yes, Daniel Abrahamsson found that

When I ran the strangely terminating case described above with GitPython's

It looks like terminating may be a rare manifestation of the bug, and that looping forever is much more common. When I simplified the

```python
import hashlib
import threading
import time

t1 = threading.Thread(target=lambda: print("hello"))
t2 = threading.Thread(target=lambda: print("goodbye"))

# Start the first thread, pause, then start the second.
t1.start()
time.sleep(1)
print("in between")
t2.start()

t1.join()
t2.join()
```

Key point: While

Other observations:
Although I got to that by starting with experiments in GitPython, that script produces the problem outside a virtual environment (i.e., in a global environment where GitPython is not installed). I ran it from a Cygwin bash shell with:

```sh
/usr/bin/python3.9 simple.py
```

And, for

```sh
strace -o strace.out /usr/bin/python3.9 simple.py
```

By the time I killed the process in the
I made a copy of the first 6610 lines of the file in

I expect some of this information may be useful to the Cygwin Python package maintainer and possibly other users/developers, and I will try to post to the Cygwin mailing list soon.

In case it turns out to be useful in subsequent experimentation on GitPython, the relationship between the short script shown above and code in GitPython that runs in a

These child threads may sometimes correspond to

I tested with a shell script that outputs both to stdout and stderr, initially with

```bash
#!/bin/bash
for i in {9..0}; do
    sleep 0.1
    if ((i % 2 == 0)); then
        printf '%d\n' "$i"
    else
        printf '%d\n' "$i" >&2
    fi
done
```

For test and demonstration purposes,

```python
#!/usr/bin/env python

from subprocess import PIPE, Popen
import threading

import git


def handle_process_output(process, stdout_handler, stderr_handler):
    def pump_stream(stream, handler):
        # Feed each line of the stream to its handler, then close the stream.
        try:
            for line in stream:
                handler(line)
        finally:
            stream.close()

    pumps = [
        (process.stdout, stdout_handler),
        (process.stderr, stderr_handler),
    ]

    # One thread per stream, so stdout and stderr are read concurrently.
    threads = []
    for stream, handler in pumps:
        t = threading.Thread(target=pump_stream, args=(stream, handler))
        t.start()
        threads.append(t)

    for t in threads:
        t.join()


def run_job():
    proc = Popen(["bash", "countdown"], stdout=PIPE, stderr=PIPE)
    handle_process_output(
        proc,
        lambda line: print(f"stdout: {line!r}"),
        lambda line: print(f"stderr: {line!r}"),
    )
    proc.wait()


def main():
    for i in range(1, 11):
        print(f"Job {i}:")
        run_job()
    print("Done.")


if __name__ == "__main__":
    main()
```

Notice that the
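Because the bug reportedly shows up in more than one way (usually looping forever, occasionally exiting silently with status 0), a reproducer like the scripts in this comment may need to be run several times and the outcomes counted. The following is a rough sketch of such a harness, not something used in this investigation. The names `classify_run` and `tally_runs`, the outcome labels, and the idea of checking for a required final line of output such as `Done.` are all assumptions made for illustration.

```python
import subprocess
import sys
from collections import Counter


def classify_run(cmd, timeout, required_output):
    """Run cmd once and label the outcome.

    "hung":       did not finish within timeout seconds (the child is killed)
    "failed":     finished with a nonzero exit status
    "incomplete": exited with status 0 but never printed required_output
    "ok":         exited with status 0 and printed required_output
    """
    try:
        result = subprocess.run(cmd, timeout=timeout, capture_output=True)
    except subprocess.TimeoutExpired:
        return "hung"
    if result.returncode != 0:
        return "failed"
    return "ok" if required_output in result.stdout else "incomplete"


def tally_runs(cmd, required_output, runs=10, timeout=30.0):
    """Run cmd repeatedly and count how often each outcome occurs."""
    return Counter(
        classify_run(cmd, timeout, required_output) for _ in range(runs)
    )
```

For example, `tally_runs(["/usr/bin/python3.9", "repro.py"], b"Done.")` would summarize ten runs of a script saved under the hypothetical name `repro.py`. The "incomplete" label is there so that a silent exit 0 is not mistaken for success.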
This is downright amazing, and I am impressed by your ability to minimise the reproducer testcase! With it, it should be easy to find the culprit and fix it. And excuse my ignorance here, but is it correct that the reproducer script will only trigger on particular versions of Python? Or only when running in a Cygwin environment? I am asking because with it it should be possible to bisect the corresponding interpreter source-code to the exact breaking commit. |
The problem strongly appears specific to the intersection of Python 3.9.18 and Cygwin, and the script behaves accordingly. This is to say that I found that script only to hang on that combination of system and Python version, and this is also where I was otherwise able to observe the bug (Cygwin being a "system" for this purpose). On systems I tested other than Cygwin, it always completed successfully, printing all three messages, including with Python 3.9.18. On Cygwin, it likewise completed successfully when tested with Python 3.8.18 and 3.9.16, but showcased the problem on 3.9.18. As noted in the message I sent to the Cygwin mailing list, most of my testing was on my local machine, but it behaves accordingly on GitHub Actions. The maintainer will likely work on the issue soon. (Although it was after my list message that I added native Windows to the CI experiment, the situation with it is still essentially as described there. I have not built unpatched versions of Python 3.9.16 and 3.9.18 for Windows, and there are no official python.org Windows builds for 3.9 past 3.9.13. The native Windows 3.9.16 and 3.9.18 builds used in the CI experiment are downstream conda-forge builds. They are unlikely to differ in a way relevant to this, but it's not necessarily a perfect comparison.)
It might also be feasible to bisect using I'm unsure how straightforward bisection would be, because I don't know if all upstream commits that work for upstream builds will build or run correctly when built for Cygwin and with Cygwin patches applied. Even for releases, not all 3.9.z versions are packaged downstream for Cygwin. Although I recommend keeping the Cygwin python39 package pinned to 3.9.16-1 in GitPython's This is especially relevant to #1791, as it means that's probably already in the clear as far as this Cygwin issue is concerned, and nothing related to this has to be done there. |
Thanks again for sharing and for taking the issue to the point where it's effectively fixed for everyone until a permanent fix is discovered by the maintainers - amazing work! |
[![Mend Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com) This PR contains the following updates: | Package | Change | Age | Adoption | Passing | Confidence | |---|---|---|---|---|---| | [GitPython](https://togithub.com/gitpython-developers/GitPython) | `==3.1.41` -> `==3.1.42` | [![age](https://developer.mend.io/api/mc/badges/age/pypi/GitPython/3.1.42?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/GitPython/3.1.42?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/GitPython/3.1.41/3.1.42?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/GitPython/3.1.41/3.1.42?slim=true)](https://docs.renovatebot.com/merge-confidence/) | --- ### Release Notes <details> <summary>gitpython-developers/GitPython (GitPython)</summary> ### [`v3.1.42`](https://togithub.com/gitpython-developers/GitPython/releases/tag/3.1.42) [Compare Source](https://togithub.com/gitpython-developers/GitPython/compare/3.1.41...3.1.42) #### What's Changed - Fix release link in changelog by [@​PeterJCLaw](https://togithub.com/PeterJCLaw) in [https://github.com/gitpython-developers/GitPython/pull/1795](https://togithub.com/gitpython-developers/GitPython/pull/1795) - Remove test dependency on sumtypes library by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1798](https://togithub.com/gitpython-developers/GitPython/pull/1798) - Pin Sphinx plugins to compatible versions by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1803](https://togithub.com/gitpython-developers/GitPython/pull/1803) - fix: treeNotSorted issue by [@​et-repositories](https://togithub.com/et-repositories) in 
[https://github.com/gitpython-developers/GitPython/pull/1799](https://togithub.com/gitpython-developers/GitPython/pull/1799) - Remove git.util.NullHandler by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1807](https://togithub.com/gitpython-developers/GitPython/pull/1807) - Clarify why GIT_PYTHON_GIT_EXECUTABLE may be set on failure by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1810](https://togithub.com/gitpython-developers/GitPython/pull/1810) - Report actual attempted Git command when Git.refresh fails by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1812](https://togithub.com/gitpython-developers/GitPython/pull/1812) - Don't suppress messages when logging is not configured by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1813](https://togithub.com/gitpython-developers/GitPython/pull/1813) - Pin Python 3.9.16 on Cygwin CI by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1814](https://togithub.com/gitpython-developers/GitPython/pull/1814) - Have initial refresh use a logger to warn by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1815](https://togithub.com/gitpython-developers/GitPython/pull/1815) - Omit warning prefix in "Bad git executable" message by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1816](https://togithub.com/gitpython-developers/GitPython/pull/1816) - Test with M1 macOS CI runner by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1817](https://togithub.com/gitpython-developers/GitPython/pull/1817) - Bump pre-commit/action from 3.0.0 to 3.0.1 by 
[@​dependabot](https://togithub.com/dependabot) in [https://github.com/gitpython-developers/GitPython/pull/1818](https://togithub.com/gitpython-developers/GitPython/pull/1818) - Bump Vampire/setup-wsl from 2.0.2 to 3.0.0 by [@​dependabot](https://togithub.com/dependabot) in [https://github.com/gitpython-developers/GitPython/pull/1819](https://togithub.com/gitpython-developers/GitPython/pull/1819) - Remove deprecated section in README.md by [@​marcm-ml](https://togithub.com/marcm-ml) in [https://github.com/gitpython-developers/GitPython/pull/1823](https://togithub.com/gitpython-developers/GitPython/pull/1823) - Keep temp files out of project dir and improve cleanup by [@​EliahKagan](https://togithub.com/EliahKagan) in [https://github.com/gitpython-developers/GitPython/pull/1825](https://togithub.com/gitpython-developers/GitPython/pull/1825) #### New Contributors - [@​PeterJCLaw](https://togithub.com/PeterJCLaw) made their first contribution in [https://github.com/gitpython-developers/GitPython/pull/1795](https://togithub.com/gitpython-developers/GitPython/pull/1795) - [@​et-repositories](https://togithub.com/et-repositories) made their first contribution in [https://github.com/gitpython-developers/GitPython/pull/1799](https://togithub.com/gitpython-developers/GitPython/pull/1799) - [@​marcm-ml](https://togithub.com/marcm-ml) made their first contribution in [https://github.com/gitpython-developers/GitPython/pull/1823](https://togithub.com/gitpython-developers/GitPython/pull/1823) **Full Changelog**: gitpython-developers/GitPython@3.1.41...3.1.42 </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about these updates again. 
--- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Mend Renovate](https://www.mend.io/free-developer-tools/renovate/). View repository job log [here](https://developer.mend.io/github/lettuce-financial/github-bot-signed-commit). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4xNzMuMCIsInVwZGF0ZWRJblZlciI6IjM3LjE3My4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiJ9-->
The bug

The latest currently packaged version of Python 3.9 for Cygwin is 3.9.18 (provided by the Cygwin package python39 at version 3.9.18-1). That version, at least as we are using it, has a problem where `pip` stalls indefinitely on some PyPI package downloads. This is the problem encountered around the time #1813 was reviewed and merged, and discussed in comments there, but it was not triggered by that or any other change in GitPython.

Details

In #1813 (comment) I said I couldn't produce it locally, but since then I have been able to do so. I believe the reason I couldn't produce it locally at that time was that I had not cleared cached `pip` downloads in `~/.cache`. I am unsure of the exact condition under which the problem occurs, but it is possibly when a single run of `pip` must download two packages; the first downloads, and the second stalls forever.

I have in some places, including some commit messages, described this as "blocking," but that was a guess, and a wrong one. When produced locally, the Cygwin `python3.9` process uses a full CPU core until terminated (so it is "spinning" rather than "blocking"). I'm willing to rebase commits on request to correct that, but I lean toward not doing so for reasons of efficiency, in part because CI will run again on them. In Cygwin with the `kill` command, `SIGTERM` does not terminate it; `SIGKILL` does not seem to either, or at least not reliably. Ending the process from the "Details" tab of the Task Manager always terminates it immediately. Once terminated, `pip` can be attempted again, and it will succeed in downloading what it had just failed to download, but then fail on the next download if there is another one (which there usually is).

This happens whether or not I upgrade the PyPA packages (`pip`, `setuptools`, and `wheel`). It sometimes happens for `wheel` when I do, which is how a number of the CI runs have failed. When it doesn't, that `pip` run succeeds. Whether or not PyPA packages were upgraded, the subsequent run of `pip install -e ".[test]"`, which installs many more packages, always encounters the problem. More specifically, it used to happen with the `coverage` package, but a few hours ago started usually happening with `pytest` instead, which I believe is just due to a different order in which packages are downloaded, triggered by the very recent release of `pytest` 8. If `pip` is terminated and rerun, it will happen on every second package, as far as I have observed.

This happens with multiple versions of the python39-pip Cygwin package, as well as with multiple versions of the PyPI pip package if requested explicitly. Therefore, while the problem happens when `pip` runs, I don't believe it is due to a bug in `pip` or even a bug in Cygwin's python39-pip package. It also does not seem to be affected by whether `pip` is run with `pip` or `python3.9 -m pip`, nor by whether or not `pip` is used in a virtual environment. It happens with multiple versions of the `cygwin` Cygwin package which provides `cygwin1.dll`. It can happen when attempting to download a source package or a wheel. It only happens on Cygwin, not native Windows or other platforms.

Examining the CI logs for commits from before and after the problem began reveals that the problem did not occur when the python39 Cygwin package was at version 3.9.16-1, and always occurred once it was at version 3.9.18-1. For example, 9b7e15f used 3.9.16-1 while 987dbf4 used 3.9.18-1. Specific versions were not requested, so typically the latest stable versions of Cygwin packages are installed, and it was on 26 January that python39 version 3.9.18-1 was promoted to stable. This does not occur with Python 3.8.
The workaround
Pinning 3.9.16-1
This pull request downgrades Python on Cygwin to the latest available patch version of 3.9 packaged for Cygwin of those that strictly precede 3.9.18 where the problem occurs. That version is 3.9.16, provided by the Cygwin package python39 at version 3.9.16-1.
This version may eventually no longer be available for download from Cygwin's repositories, so hopefully a real solution or better workaround will be found by then, or perhaps a future update to the package itself will fix the problem. Because 3.8 works, an existing backup workaround is to downgrade even further to 3.8.
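As a small aid for scripts or test suites that need to know whether they are running on the combination described here, the check below tests for Cygwin together with Python 3.9.18. It is only a sketch based on the observations in this pull request. The name `is_affected` and the `AFFECTED_VERSIONS` set are hypothetical, and the check cannot tell which downstream package build is installed, so it may be both too broad and too narrow.

```python
import sys

# Versions observed to stall on Cygwin in this pull request. Treating this
# set as complete is an assumption, not something the investigation showed.
AFFECTED_VERSIONS = {(3, 9, 18)}


def is_affected(platform=None, version=None):
    """Return True for the Cygwin + Python version combination seen to stall."""
    if platform is None:
        platform = sys.platform  # "cygwin" when running under Cygwin's Python
    if version is None:
        version = sys.version_info[:3]
    return platform == "cygwin" and tuple(version) in AFFECTED_VERSIONS
```

Called with no arguments, `is_affected()` inspects the running interpreter. Passing `platform` and `version` explicitly makes it possible to check a combination taken from a CI log instead.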
Switching actions to facilitate pinning
Although GitHub code search finds repositories where the official cygwin/cygwin-install-action GitHub Action was used, or an attempt made to use it, with the `package=version` syntax for specifying Cygwin packages to be installed, that does not appear to work. It doesn't report an error when I attempt it, but the version number seems always to be ignored. So in order to pin python39 at 3.9.16-1, I also switched to using egor-tensin/setup-cygwin. The dependency graph feature in GitHub reports them as being about equally popular (official, unofficial).

However, I removed `add-to-path: false`; it seems egor-tensin/setup-cygwin doesn't have such a feature. So it may be worthwhile to switch back once we can, to use that again. The main benefit of `add-to-path: false` is clarity about the environment from which we are using Cygwin facilities, but it can also help with cleanup/finalization performed by actions that are run before Cygwin is installed: actions/checkout issues a warning and seems like it may not be cleaning up fully, due to attempting to use the Cygwin `git` for cleanup. I think it is okay for now because the GitHub hosted runners are virtual machines that get deleted and recreated each time; only self-hosted runners are (potentially) reused. But if we have to keep using it for a long time then that should be fixed.

Other things I tried
I tried a bunch of other things while investigating this, as well as trying a few variations on the downgrades (corresponding to some of the things I said made no difference above). If each of my original commits were its own commit in this pull request, the pull request would have more than 40 commits. Although this may not be inherently excessive, it seems to me that it was better to squash them down into two commits, representing the two actually useful workarounds I found (downgrading to 3.8, and downgrading only to 3.9.16), each done in what seemed like the best of the ways I tried.
However, for future investigation, I preserved the full history by using GitHub itself to perform the squashes, from two fork-internal PRs, EliahKagan#2 and EliahKagan#3, which are linked and explained in the two commits. (I had to amend the second after squashing to fix a mistake in its title, which is why its hash differs from the hash the PR shows as merged.) The individual original commits can be examined in those PRs, though I don't believe it is at all necessary to look at that to review this PR (if I did, I wouldn't have squashed them). Note that some of those individual commits might be misleading individually because the changes did not always achieve what the messages described; in particular, attempts to pin packages before switching from cygwin/cygwin-install-action to egor-tensin/setup-cygwin did not actually do any pinning.