Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ask git where its daemon is and use that #1697

Merged
merged 1 commit into from
Oct 8, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 0 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -121,9 +121,6 @@ _Important_: Right after cloning this repository, please be sure to have execute
the `./init-tests-after-clone.sh` script in the repository root. Otherwise
you will encounter test failures.

On _Windows_, make sure you have `git-daemon` in your PATH. For MINGW-git, the `git-daemon.exe`
exists in `Git\mingw64\libexec\git-core\`.

#### Install test dependencies

Ensure testing libraries are installed. This is taken care of already if you installed with:
Expand Down
2 changes: 1 addition & 1 deletion test/lib/helper.py
Original file line number Diff line number Diff line change
Expand Up @@ -182,7 +182,7 @@ def git_daemon_launched(base_path, ip, port):
# and then CANNOT DIE!
# So, invoke it as a single command.
daemon_cmd = [
"git-daemon",
osp.join(Git()._call_process("--exec-path"), "git-daemon"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm uneasy about having the test suite call the nonpublic _call_process method to do this. I want it to use the same git as GitPython uses, including the effect of the GIT_PYTHON_GIT_EXECUTABLE (as well as any future nuances that might ever arise) automatically, which Git().execute(["git", "--exec-path"]) would not do. If the command were git exec-path or git something --exec-path, then I think Git().exec_path() or Git.something(exec_path=True), respectively, could be used. But for a git command that has no subcommand and just passes an option, I don't know of a way to use GitPython's public interface to run it. It may be that I'm just missing something obvious here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's definitely a shortcoming in the Git class' API, it does always assume a sub-command. This also makes it impossible to set configuration overrides, for instance, so finding a solution for this will have immediate benefits, and it would definitely be welcome.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main hard part of adding such functionality is figuring out a way to do it that wouldn't be a breaking change. Most ordinary public-style method names could clash with someone's custom git commands (such as scripts named like git-*), which GitPython is generally able to run and thus would begin to break after such a change. The most intuitive names for this, like invoke, would be especially likely to clash (I'm sure some people have a git-invoke script).

This also makes it impossible to set configuration overrides

Is this because only one -c can be passed by calling the Git instance with c=..., or for some other reason (or am I misunderstanding what you mean)? To use an example inspired by check-version.sh, and with g as a git.cmd.Git instance, I can cause -c versionsort.suffix=-pre to be passed, and in the correct position, with:

g(c="versionsort.suffix=-pre").tag(sort="-v:refname")

That runs git -c versionsort.suffix=-pre tag --sort=-v:refname as desired, with -c versionsort.suffix=-pre before the subcommand name and --sort=-v:refname following it.

However, I can't pass more than one -c that way, because a single call can't pass the same keyword argument multiple times, and the preceding arguments are discarded with multiple calls, i.e., these do the same thing:

g(c="versionsort.suffix=-pre")(c="versionsort.suffix=-RC").tag(sort="-v:refname")
g(c="versionsort.suffix=-RC").tag(sort="-v:refname")

But I'm not sure this is the problem you're thinking of, because a solution for passing -c arguments and their operands, or for passing arbitrary arguments before a subcommand, would not necessarily facilitate running git without a subcommand. Nor would a solution for running git without a subcommand necessarily allow a subcommand to be added in a user-friendly way supporting the keyword argument syntax for specifying the subcommand's own flags.

finding a solution for this will have immediate benefits

Can people just use _call_process?

For having GitPython run git with arbitrarily specified arguments, the nonpublic _call_process method does that. Does its behavior differ from the desired behavior for doing so?

If not, then that method could be made public simply by documenting it as public, which would avoid breaking any custom git commands, because (a) it wouldn't change the actual behavior of GitPython at all, and (b) GitPython already doesn't support custom git commands that start with _, and git itself doesn't support custom commands that start with - (since an attempt to invoke such a command would pass one or more options instead).

An example of where an attribute with a leading _ that is made public by documenting it as public, for the same reasons as we might want to do so here--that any other name might clash--is how types constructed with the collections.namedtuple factory have public _make, _asdict, _replace, _fields, and _field_defaults attributes. (In contrast, although the _thread module is public, this is not really an example of this, because it is not named that way for a similar reason.)

On the other hand, there may be some reasons not to make _call_process public by declaring it so. The interface for collections.namedtuple is simpler than for git.cmd.Git, and also more widely known about because it is part of the standard library, so deviations from common naming conventions may be more discoverable. Also, intuitively, even if _call_process were public, its name suggests that its use from outside GitPython's own code would be rarer than execute. But using a Git object to run a non-git command should be rare, so if _call_process is public then it should be used more often than execute.

Making a "submethod" to run git with literal arguments

One possibility, again where g is a Git instance, could be to allow g.execute.git(*args), accepting zero or more separate positional arguments in place of *args that GitPython would immediately run git with. I find this intuitive, and it could be achieved by making the execute method a custom descriptor that works like a bound method, except that it also causes g.execute.git to resolve to g._call_process, and Git.execute.git to resolve to Git._call_process (so it also works explicitly pass g to the unbound form, as methods are expected to support).

But the problem with this is that it is not obvious whether the "submethod" ought to continue being usable when a class that derives from Git overrides execute. Secondarily, I think having overrides turn into descriptors that also support .git would be complicated, and might go against assumptions people make about he effect of writing a subclass.

To be clear, the problem is not that overriding execute affects the behavior. That is already the case with _call_process and everything that uses it, and is probably the main reason for a subclass of Git to override execute. Rather, the question is whether MyGit().execute.git(*args) and MyGit().execute.git(my_g, *args) should work and, if so, whether the complexity to make it work is justified.

Other ways, which also don't seem ideal

Other possibilities include:

  • Naming the method a single underscore: g._(*args). This seems unintuitive.
  • Versioning the interface, so something has to be passed when a Git object is constructed to enable new methods.
  • Keeping the Git class the same but providing a derived class of Git that includes new methods.
  • Using a top-level function that receives the Git object as its first argument.
  • Using a top-level function that does not use the Git object.
  • Picking some name people probably are not using as a custom git command (but the more reliably they are not, the less intuitive the command is, probably).
  • Not adding a feature for this, but adding a convenient way to get the git command (relative or absolute path) that _call_process passes to execute, and noting how to use execute with it in execute's docstring, elsewhere in the documentation, or both.

A hack that shouldn't be used

By the way, it turns out there actually is a way I could have used the "public" interface to achieve the effect of g._call_process("--exec-path"). Because git accepts a -- after this option with no change in behavior, we can fool GitPython into thinking -- is the subcommand. Where again g is a Git instance:

>>> getattr(g(exec_path=True), "--")()
'C:/Users/ek/scoop/apps/git/2.42.0.2/mingw64/libexec/git-core'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I can't pass more than one -c that way, because a single call can't pass the same keyword argument multiple times, and the preceding arguments are discarded with multiple calls

That's interesting! I wasn't even aware this works. Also I don't know how this interacts with the maybe more typical usage of repo.git.subcommand(), or is something like repo.git(c="foo=bar").subcommand() possible?

Regarding the multi-issue, maybe it already works like this?

g(c=["versionsort.suffix=-pre", "other=baz"]).tag(sort="-v:refname")

To be clear, the problem is not that overriding execute affects the behavior. That is already the case with _call_process and everything that uses it, and is probably the main reason for a subclass of Git to override execute. Rather, the question is whether MyGit().execute.git(*args) and MyGit().execute.git(my_g, *args) should work and, if so, whether the complexity to make it work is justified.

Do you think it's common to subclass Git? I'd argue that this was never intended and I'd rather forbid it than think about it. And if it can't be prohibited officially, maybe it's possible to document it as "unsupported" which allows subclasses to break if they happen. Of course, that itself would be a breaking change, but I wonder anyone would notice.

Also, apologies in advance if what I say here doesn't make much sense or seems to ignore something you already mentioned - I am quite ignorant as to how Git (the class) is truly working and I really don't know what's best.

But simply making _call_process public officially seemed like the easiest while safe-enough way to go to me.

PS: >>> getattr(g(exec_path=True), "--")() is wonderfully creative :D.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's interesting! I wasn't even aware this works. Also I don't know how this interacts with the maybe more typical usage of repo.git.subcommand(), or is something like repo.git(c="foo=bar").subcommand() possible?

g(c=["versionsort.suffix=-pre", "other=baz"]).tag(sort="-v:refname")

That does work! This treated both pre and RC1 as lower versions than their corresponding stable versions:

g(c=["versionsort.suffix=-pre", "versionsort.suffix=-RC"]).tag(sort="-v:refname")

It does also work with repo.git(...).subcommand(...):

>>> import logging
>>> logging.basicConfig(level=logging.INFO)
>>> import git
>>> git.Git.GIT_PYTHON_TRACE = True
>>> r = git.Repo("../Flood")
>>> r.git(c=["versionsort.suffix=-pre", "versionsort.suffix=-RC"]).tag(sort="-v:refname")
INFO:git.cmd:git -c versionsort.suffix=-pre -c versionsort.suffix=-RC tag --sort=-v:refname
'unletterspaced\nline-height\nletterspaced\nalpha-6\nalpha-5\nalpha-4\nalpha-3\nalpha-2\nalpha-1\nalpha-0'

(The repo I tested on doesn't have tags whose order is affected by versionsort, but the debugging output shows that both -c ... are passed and in the correct positions.)

Do you think it's common to subclass Git?

I'm not sure, but it may be possible to effectively search at least code on GitHub and elsewhere where rich code searching is implemented to figure it out. For a lot of stuff using popular projects like GitPython, I find such searching hard, because one gets lots of code in forks and vendored copies of the project. But if GitPython is never itself inheriting from Git or testing for that, it may be reasonably easy to find that sort of thing.

If I figure anything out about that, I'll let you know. I would intuitively expect to be able to inherit from it.

But simply making _call_process public officially seemed like the easiest while safe-enough way to go to me.

Yes. If that's acceptable, then I think it should be strongly considered before doing anything more complicated that also expands the GitPython public interface. A further argument for preferring this to other approaches is that it is already referenced in some public methods' docstrings.

PS: >>> getattr(g(exec_path=True), "--")() is wonderfully creative :D.

:) I guess there's an interesting question about whether the -- "attribute" of Git instances should be considered public on the grounds that its name does not start with an underscore. 😺

Actually, I had meant to be joking, just then, but it checks out:

ek@Kip:~$ cat x.py
globals()["--"] = "Hello, world!"
ek@Kip:~$ python3.11 -c 'from x import *; print(globals()["--"])'
Hello, world!
ek@Kip:~$ cat y.py
globals()["_--"] = "Hello, world!"
ek@Kip:~$ python3.11 -c 'from y import *; print(globals()["_--"])'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
KeyError: '_--'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I figure anything out about that, I'll let you know. I would intuitively expect to be able to inherit from it.

I am definitely happy to make this a non-feature, or at least document that subclass behaviour might change unannounced.

"--enable=receive-pack",
"--listen=%s" % ip,
"--port=%s" % port,
Expand Down
Loading