lint: add typos check #1888

Borda · 2024-03-31T20:32:05Z

Just a suggestion: add a check for typos, and maybe fix some of them without breaking the API.

test/fixtures/commit_with_gpgsig

README.md

EliahKagan · 2024-03-31T21:50:16Z

I've noticed a considerable number of areas in the diff where correct names are made incorrect ("rela" stands for "relative" and I don't think there are any occurrences where it should be changed to "real", and there are some others). This is not limited to the GPG signature and project-name cases that you've identified. In addition, I'm not sure any changes should be made in files in test/fixtures that are used as test repository contents.

However, I've also noticed that you've marked this as a draft, and maybe you are aware of the other issues. If you think it would be helpful for me to leave a review with comments on the individual problematic cases, I'd be pleased to do so. Otherwise I will assume, as long as this is a draft, that such a review might be more of a distraction than a help, and refrain from it.

There are also some areas where at least the fixes are clearly a huge improvement, particularly in test/test_index.py where it had not even occurred to me that, because of the way pytest marks work, I could misspell raises and accidentally write xfail marks that don't enforce specific exception types. At minimum, that should certainly be fixed. Hopefully the idea here will work out in some form, but even if not, I can't speak for you or for Byron but from my perspective the effort so far is worth it just for finding that.

EliahKagan · 2024-04-01T02:24:23Z

To make sure it is not lost track of, and also to report the results of some manual testing because the affected xfail markings cover some things not produced on CI, I've opened #1893 for the bug you've discovered in test_index.py here. This is the bug where three of the xfail markings pass misspelled keyword arguments that attempt unsuccessfully to cause the test to report an unexpected failure if the exception is not as declared. It differs from some of the other misspellings found here because it affects the behavior of the tests and can cause an unexpected failure to be wrongly reported as an expected failure.
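
To illustrate the mechanism described above, here is a minimal, standard-library-only sketch. It does not use pytest itself: `make_mark` and `classify_failure` are hypothetical stand-ins, written on the assumption that a mark stores arbitrary keyword arguments and that the expected exception type is later looked up under the exact key `raises`. The misspelling `rasies` is illustrative and not necessarily the one found in test_index.py.

```python
def make_mark(**kwargs):
    """Hypothetical stand-in for an xfail mark: it keeps whatever keywords it is given."""
    return dict(kwargs)


def classify_failure(mark, exc):
    """Classify a test failure as expected ("xfailed") or unexpected ("failed")."""
    expected_type = mark.get("raises")  # a misspelled key is simply never looked up
    if expected_type is not None and not isinstance(exc, expected_type):
        return "failed"  # wrong exception type: report an unexpected failure
    return "xfailed"  # no type constraint, or the type matches


correct = make_mark(reason="known bug", raises=ValueError)
misspelled = make_mark(reason="known bug", rasies=ValueError)

# The test body raises KeyError, which is not the declared ValueError.
print(classify_failure(correct, KeyError("boom")))     # failed
print(classify_failure(misspelled, KeyError("boom")))  # xfailed
```

Under these assumptions, the misspelled keyword is accepted silently, so the mark behaves as if no exception type had been declared at all.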

Although those are definitely not the only typos found here that should be fixed, it seems to me that their elevated importance and relationship to the correctness of the tests justifies a separate PR to fix them, especially if such a PR would result in their being fixed sooner (and then they would no longer have to be worried about here).

If you are amenable to this idea, then I suggest opening that, as you deserve the credit for it. But I would be pleased to open that PR instead if you prefer (I would list you in the Co-authored-by trailer).

While another option may be to wait for the change to come in with this PR, I think it is better that it not be delayed while figuring out if and how automated spell checking can be added safely and with an acceptably low rate of false positives.

Byron · 2024-04-02T08:21:45Z

Thanks for sharing this draft. I am happy it could already find a genuine issue (#1893) despite a high rate of false positives.
My feeling here is that, given that high rate and the known issues in the underlying engine, trying to make this work beyond what's here won't be worth it. But that will be for @Borda to decide, and I'd appreciate such a decision so the PR won't stay open for too long.

Thank you

Borda · 2024-04-02T08:53:37Z

In my other projects I have been using several typo-checking tools, and this one seemed at first to be lower effort. But as mentioned, it produces a significant number of false positives, and with the next version there could be even more (I just opened crate-ci/typos#966 and crate-ci/typos#969).

So I'll open a separate PR for the fixes and most likely pivot this PR to use another typo-checking alternative :)

Borda · 2024-05-07T17:32:37Z

pivoting to https://github.com/codespell-project/codespell

Borda · 2024-05-07T17:40:24Z

@EliahKagan @Byron, would you mind having a look at the updated version?
Not sure what to do about doesnt, which is used as a file name... 🐿️

Byron

Thanks a lot for making it happen!

Now it looks like the tool is usable, and it's nice to see that it caught a couple of real errors.

I will wait for @EliahKagan's approval before merging, though, in case I am missing some of the more obscure aspects of the tool, and because it's integrated into the tooling of GitPython.

git/objects/util.py

git/util.py

EliahKagan

I think the keyword argument spelling fixes in test/test_index.py justify adding automated spell-checking, even though various cases remain where spell-checking seems to have led to incorrect or sub-optimal changes.

These can be fixed, and the risk that spell-checking would lead to such cases being introduced later is, in my opinion, outweighed by the benefits of catching misspellings that, due to the dynamic nature of Python and its idiomatic uses, may affect the behavior of GitPython or its tests.

I've looked at each change and commented about the ones that I think should not be done or otherwise still need improvement. Some comments cover multiple changes, so the absence of a comment on a specific change does not mean that I think it is correct.

I recommend that this PR be marked as fixing #1893.

Edit: If Cygwin tests fail with "dubious ownership" errors when more commits are pushed to this pull request, that is not any fault of this PR, but also happens without the changes here. I've opened #1916 to fix it. If that pull request is merged, then merging from main or (perhaps better) rebasing this PR feature branch onto main should allow new Cygwin runs to pass here too.

EliahKagan · 2024-05-26T19:03:33Z

git/index/base.py

@@ -439,9 +439,9 @@ def raise_exc(e: Exception) -> NoReturn:
            # END glob handling
            try:
                for root, _dirs, files in os.walk(abs_path, onerror=raise_exc):
-                    for rela_file in files:
+                    for relative_fpath in files:


Why is the name relative_fpath better than rela_file? Why was this chosen instead of relative_path? If the f in fpath is unimportant, then it should be removed. If it is important, then it should be spelled out. If explicitness is not required, then presumably rela_file is also okay, in which case it should not be changed just to make the spell checker happy.

This applies to most occurrences of relative_fpath, including in other files.

My guess is that this should be relative_path. The nonpublic _items_to_rela_paths method was renamed to _items_to_relative_paths. Assuming that change is good, which I think it is, it seems like relative_fpath should just be relative_path.

Borda · 2024-05-30

rela was marked as a typo, so I found it easier to use the full name without affecting the API.

EliahKagan · 2024-05-30

I don't think relative_fpath is a full name. It looks like a typo that is meant to be relative_path without the f.

Maybe it is not a typo. Maybe f is an abbreviation for something that should be spelled out (if important) or omitted (if unimportant).

The key point is that I do not know what relative_fpath means in the places where this PR has introduced it, and I have not been able to figure that out. (I have been able to guess that the f stands for "file," but I am not certain of this, and without knowing the old variable name rela_file, I would likely not even have been able to guess this.) I expect that other current or future readers may also not know what it means.

I recommend changing it, probably to relative_path.
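
For context on what the names in this discussion refer to, the following minimal sketch (not GitPython's code; the temporary directory layout and the names `file_name` and `relative_path` are illustrative) shows that `os.walk` yields bare file names in `files`, and that a path relative to the starting directory has to be built from `root` and the file name.

```python
import os
import tempfile


def walk_relative(top):
    """Return (file_name, relative_path) pairs for every file under top."""
    pairs = []
    for root, _dirs, files in os.walk(top):
        for file_name in files:  # a bare name with no directory part
            full_path = os.path.join(root, file_name)
            relative_path = os.path.relpath(full_path, top)  # path relative to top
            pairs.append((file_name, relative_path))
    return pairs


with tempfile.TemporaryDirectory() as top:
    # Create top/sub/a.txt as the only file in the tree.
    os.makedirs(os.path.join(top, "sub"))
    with open(os.path.join(top, "sub", "a.txt"), "w"):
        pass
    pairs = walk_relative(top)

print(pairs)
```

Here each pair holds the bare file name first and the path relative to `top` second, which is the distinction a variable name in such a loop has to convey.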

git/objects/util.py

git/refs/symbolic.py

EliahKagan · 2024-05-26T19:29:21Z

git/remote.py

-            # uptodate encoded in control character
+            # up-to-date encoded in control character


This change may seem at first glance to be obviously correct, but I think it actually may be wrong, and that if it is to be kept then it requires a specific technical justification.

I think uptodate is a specific technical term in Git. In the Git source code, it often appears capitalized, but it also appears lower-case in multiple places, which also seems to be intentional. As one example, in fetch.c:

    /* uptodate lines are only shown on high verbosity level */
    if (verbosity <= 0 && oideq(&ref->peer_ref->old_oid, &ref->old_oid))
        continue;

It seems like that specific technical meaning is the one relevant here. If this has been verified not to be the case, then the change here is okay. Otherwise, either the change should be undone and uptodate added as a correct spelling, or it should be investigated.

Although this feels minor, making technical terms harder to search for can accumulate and make a codebase difficult to work with. That is both potentially relevant to this specific change, and a potential risk of automated spell-checking.

Borda · 2024-07-17

I think in this case making it a command associate would be better

EliahKagan · 2024-05-26T19:52:05Z

pyproject.toml

@@ -79,3 +79,9 @@ lint.unfixable = [
 "test/**" = [
    "B018",  # useless-expression
 ]
+
+[tool.codespell]
+skip = 'test/fixtures/reflog_*'


I don't think any of the files in fixtures that represent test input or expected output should be spell-checked. I think the *.py files in fixtures, which are actually run as code, should be spell-checked, and that other files should not.

It seems to me that the question to ask is, if code appearing inside a fixture were found to have a logic error, should that bug be fixed? A number of fixture files have Ruby code or diffs thereof, but these are just test data. If logic errors in that code (which isn't run) shouldn't be fixed, then either the same files should not be spell-checked, or the justification for spell-checking them should be made clear. The issues with these kinds of changes are:

- Churn in test data may make it so that changes to test data that are actually done to improve the tests are hard to identify.

- Changes in test data need to be reviewed to evaluate whether they could have any impact on the tests. It is possible, in general, for a change to test data to keep a test passing, while preventing it from catching regressions that it would have caught before the test data changed.

The first concern is minor and may well be overcome by the slight readability improvement of avoiding typos. The second concern is less minor and it seems to me that this is not worth the risk, even if small. Tests can assert things that are affected by the presence or absence of specific strings or that involve specific lengths.

This also applies to all changes in test/fixtures/diff_mode_only. I have not posted separate comments there.

Edit: See also #1920 (review).
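
As a rough illustration of why the `skip` pattern in the diff above leaves other fixtures in scope, here is a sketch using Python's `fnmatch`. It assumes that codespell's `skip` globs behave like `fnmatch` patterns matched against the path, which should be checked against codespell's documentation. The file name `test/fixtures/reflog_example` is hypothetical, and the broader `test/fixtures/*` pattern is only one possible way to exclude fixtures, not necessarily what this PR ended up using.

```python
from fnmatch import fnmatchcase

paths = [
    "test/fixtures/reflog_example",      # hypothetical fixture name
    "test/fixtures/diff_mode_only",
    "test/fixtures/commit_with_gpgsig",
    "test/test_index.py",
]


def skipped(pattern, candidates):
    """Return the candidate paths that the glob pattern would exclude."""
    return [path for path in candidates if fnmatchcase(path, pattern)]


narrow = skipped("test/fixtures/reflog_*", paths)  # pattern from the diff
broad = skipped("test/fixtures/*", paths)          # illustrative alternative

print(narrow)
print(broad)
```

Under that assumption, the narrow pattern excludes only the reflog fixtures, so files such as diff_mode_only would still be spell-checked. A blanket pattern would also exclude any *.py files in fixtures, so it would need refining if those are to remain checked.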

Borda · 2024-07-16

Yes, all sounds reasonable to me, so how about splitting this into two PRs?

1. Add the typos check, with fixtures excluded.

2. Revisit the fixtures' typos and eventually stop excluding this folder from the typo check.

Byron · 2024-07-17

@EliahKagan Did you see this message?

EliahKagan · 2024-07-17

This seems like a good approach to me.

pyproject.toml

test/test_refs.py

Byron

I am now officially setting this PR to a state that indicates that some modifications are needed.

git/remote.py

Borda · 2024-07-17T10:34:56Z

test/test_exc.py

@@ -52,7 +52,7 @@

 _streams_n_substrings = (
    None,
-    "steram",
+    "stream",


this is probably linked to fixtures / test data

Borda · 2024-07-17T10:41:34Z

@EliahKagan @Byron I reverted most of my additional changes, so this PR now just adds the check and fixes all flagged issues, while also excluding fixtures...

Byron

Thanks a lot for the minification of the PR, it looks good to me!

EliahKagan

Thanks!

Some of the CI tests use WSL. This switches the WSL distribution from Debian to Alpine, which might be slightly faster. For the way it is being used here, the main expected speed improvement would be to how long the image would take to download, as Alpine is smaller.

(The reason for this is thus unrelated to the reason for the Alpine docker CI test job added in gitpython-developers#1826. There, the goal was to test on a wider variety of systems and environments, and that runs the whole test suite in Alpine. This just changes the WSL distro, used by a few tests on Windows, from Debian to Alpine.)

Two things have changed that, taken together, have unblocked this:

- Vampire/setup-wsl#50 was fixed, so the action we are using is able to install Alpine Linux. See: gitpython-developers#1917 (review)

- gitpython-developers#1893 was fixed in gitpython-developers#1888. So if switching the WSL distro from Debian to Alpine breaks any tests, including by making them fail in an unexpected way that raises the wrong exception, we are likely to find out.

Borda marked this pull request as draft March 31, 2024 20:47

This was referenced Mar 31, 2024

Don't suppress pytest warning summaries #1892

Merged

Some xfail markings fail to validate their exception types #1893

Closed

Borda marked this pull request as ready for review May 7, 2024 17:40

Byron approved these changes May 7, 2024

EliahKagan suggested changes May 26, 2024

Borda force-pushed the precommit/typos branch from 0451fdd to abd9683 Compare May 30, 2024 10:21

This was referenced May 30, 2024

precommit: enable end-of-file-fixer #1920

Merged

precommit: enable validate-pyproject #1921

Merged

Byron requested changes May 31, 2024

Borda added 2 commits July 17, 2024 12:17

use codespell

1c88b0a

fix & skip

2ce013c

Borda force-pushed the precommit/typos branch from b1aa63d to 2ce013c Compare July 17, 2024 10:20

fixing

93993b2

Apply suggestions from code review

813520c

Borda requested review from EliahKagan and Byron July 17, 2024 10:40

Byron approved these changes Jul 17, 2024

Byron merged commit 89822f8 into gitpython-developers:main Jul 17, 2024
26 checks passed

EliahKagan reviewed Jul 17, 2024

Borda deleted the precommit/typos branch July 17, 2024 17:30

EliahKagan mentioned this pull request Jul 24, 2024

Use Alpine Linux in WSL on CI #1945

Merged
