Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OO refactoring and git-blame preloading #12

Merged
merged 28 commits into from
Mar 4, 2020

Conversation

uliska
Copy link
Contributor

@uliska uliska commented Mar 2, 2020

This has become more involved and intrusive than I'd prefer at this point of contribution to a repository, and I'm feeling somewhat uneasy about it. However, it became an all-or-nothing thing and wouldn't have made sense to send slice by slice.

This PR does more than one thing unforfunately:

  • Replace the GitPython dependency with an object oriented abstraction with classes
    • GitCommand
    • Repo
    • Commit
    • Page
    • Author
  • Preload the git-blame statistics in the on_files event

I think this restructuring will make it easier to add more functionality and is therefore useful.

It does not include a proposal for #9 yet although that should be trivial.

This commit completely breaks the unit tests, but I didn't want to put my hands on these before getting some feedback on the PR itself.

Create a Git repository abstraction with just what we need.

This will be continued with a Book class holding/storing the info
about the whole book, its size and its authors.
Have dedicated objects:
- Repo
- Author
- Page
- Commit
handle all the data and responsibilities.
This commit creates the Page objects within the on_files
MkDocs event.

As a result the lines/contribution statistics for the whole book
are now available from within *every* Markdown page.
# Conflicts:
#	mkdocs_git_authors_plugin/plugin.py
#	mkdocs_git_authors_plugin/util.py
The 00000 "commit" shown by git blame for uncommitted lines
caused the plugin to crash when trying to execute "git show" on it.
Therefore uncommitted stuff is attributed to a fake author whose
display characteristics are configurable.
Inserting {{ git_authors_list }} in any Markdown file
will insert a list of all authors along with their contribution
to the whole site.

Adds configuration options:
- show_lines (default: false)
  also show the line count in the list
  (but not on a page's summary)
- label_lines (for localization)

TODO: Provide *real* configurability of the resulting HTML
@uliska
Copy link
Contributor Author

uliska commented Mar 3, 2020

I've now added th {{ git_authors_list }} feature proposed in #9.

### Mitwirkende

Folgende Autoren haben an der Erstellung dieser Dokumentation mitgewirkt:

{{ git_authors_list }}

results in the following output:

grafik

with the following configuration:

- git-authors:
    show_contribution: true
    show_lines: true
    label_lines: Zeilen
    uncommitted_name: Nicht gesichert

The "Nicht gesichert" refers to just this example which hasn't been committed.

I think I would like to see an option labels, which allows more labels to be added in a more structured way.

[This of course still breaks all unit tests.]

Adds configuration options
- sort_by (choice: 'name', 'contribution')
- sort_reverse (bool, default: False)
@uliska
Copy link
Contributor Author

uliska commented Mar 3, 2020

With the sort order I think I'm basically done with everything I'd like to have added to the plugin.

If you basically agree with the restructuring in this PR

  • I'll take care of adding unit tests (actually I'd be glad if you could look after the existing ones, @timvink)
  • I'll add the relevant documentation

I have a final suggestion for you, if you agree with my approach to removing GitPython support: I would merge this (as a PyPi package) with the mkdocs-git-revision-date(-localized) packages, but still expose them as separate MkDocs plugins. So all packages can share the basic repo module and its functionality.

There may be users not interested in counting empty lines
as content.
@uliska
Copy link
Contributor Author

uliska commented Mar 3, 2020

Sorry, yet another one ;-)

@timvink
Copy link
Owner

timvink commented Mar 3, 2020

Wow, I can see you definitely went for it ! Starting to enjoy the open source work @uliska ?

This is a huge PR, which also means I have a lot of comments. I focussed especially on maintainability and simplicity. In general I think it's a great PR that will really help with future additions as well. I realise it addresses your specific needs, but it still needs more work to get to a place where it is very usably for a large group of people.

I already mentioned them in the specific review comments, but main review points are:

  • Reduce verbosity, some methods not used or necessary
  • I am much more concerned about stability than adding an extra 850kb dependency on a very matureGitPython. Let's seriously consider removing GitCommand and reintroduce GitPython.
  • Separate concerns. Keep the git stuff in the git classes, and the (html) summary stuff in util.py
  • I had to be a bit strict on some of the extra options and the extra markdown tag. Hope you understand. See motivation in the review comments.
  • I have been more explicit on the 'roadmap' I envision for this plugin, see Roadmap / Vision [status] #16

I have a final suggestion for you, if you agree with my approach to removing GitPython support: I would merge this (as a PyPi package) with the mkdocs-git-revision-date(-localized) packages, but still expose them as separate MkDocs plugins. So all packages can share the basic repo module and its functionality.

Thanks for the suggestion. I'd prefer to keep depending on GitPython for mkdocs-git-revision-date-localized. Copying the repo.py will mean more complex maintainability. Inevitably there are going to be specific bugs with new code.. and I'm already noticing some copy/paste troubles between the packages. You did some good work with the Git abstractions however. I would be open to depending on a separate new python package. There might be more demand by package authors to use a very simple and very lightweight Git wrapper (GitPython docs are too verbose and a bit unclear to be honest). Something for you to consider :)

Before working on adding docs and unit tests, let's address the review comments first OK ?

@uliska
Copy link
Contributor Author

uliska commented Mar 3, 2020

Starting to enjoy the open source work @uliska ?

Well, if you look at my profile you may realize that "starting" doesn't really pin it down ;-) (and that's only GitHub).

Rather I tend to become creative extending code with (generally) useful stuff - when there's something to build upon.

I realise it addresses your specific needs, but it still needs more work to get to a place where it is very usably for a large group of people.

I'm aware of that. But things will only move on when started in the first place.

I already mentioned them in the specific review comments,

I seem to be missing something, but I don't see any review comments.

Reduce verbosity, some methods not used or necessary

I assume I'll see what you mean once I see the review comments?

Let's seriously consider removing GitCommand and reintroduce GitPython.

I suggest leaving that question to itself for a few days to see what comes out of contemplating it.

You did some good work with the Git abstractions however. I would be open to depending on a separate new python package. There might be more demand by package authors to use a very simple and very lightweight Git wrapper (GitPython docs are too verbose and a bit unclear to be honest). Something for you to consider :)

I don't think converting my code into a standalone Git wrapper makes a lot of sense. The point of the exercise basically was to only implement what's actually needed for the "client" (the plugin). Once it is exposed to a "general audience" all kinds of requests would come, and it would end as a less-tested GitPython.

Copying the repo.py will mean more complex maintainability.

I will not push you on this, but maybe there's a misunderstanding here. I didn't talk about copying the code. I was thinking of one common Python package that includes two MkDocs plugins as exposed through different entry points in the setup.py file.

For today my time is completely taken up, but I'll consider your comments in more detail when I get the chance to see your individual review comments (maybe there's an option to finally publish a review?)

Copy link
Owner

@timvink timvink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now that I need to 'submit' the review!

Looking at your profile it seems you are much more experienced in OS than me for sure ;) What I meant was that it looks like you are starting to enjoy the work on this plugin. For me at least I find it very cool to make tools that others use! Hopefully I'll learn a bit from you as we go through the review comments.

And yes, I did misunderstand your point on merging with mkdocs-revision-date-plugin. Actually it makes sense, as there are many common components. I need to think about this, and I might open a new issue to discuss / document.

mkdocs_git_authors_plugin/plugin.py Show resolved Hide resolved
mkdocs_git_authors_plugin/repo.py Outdated Show resolved Hide resolved
mkdocs_git_authors_plugin/plugin.py Outdated Show resolved Hide resolved
mkdocs_git_authors_plugin/repo.py Outdated Show resolved Hide resolved
mkdocs_git_authors_plugin/repo.py Show resolved Hide resolved
mkdocs_git_authors_plugin/plugin.py Show resolved Hide resolved
mkdocs_git_authors_plugin/plugin.py Outdated Show resolved Hide resolved
mkdocs_git_authors_plugin/plugin.py Show resolved Hide resolved
mkdocs_git_authors_plugin/plugin.py Show resolved Hide resolved
mkdocs_git_authors_plugin/repo.py Show resolved Hide resolved
uliska added 12 commits March 3, 2020 19:05
We don't have to go through the member accessor function
when we have the repo reference as a function argument.
This method is only used once and can easily be avoided by
directly passing the sha argument to Commit._populate.

Incidentally: Move the handling of uncommitted lines
*into* Commit._populate
This is not related to the actual object but a simple conversion
function and therefore an optimal candidate for a generic "util" module
The previous name raised concerns about users being worried
"commit" might be used as a verb (i.e. creating commits in the
actual repository).
The configuration variables intended for localization have to
be removed because the functionality will at one point be
handled by localization. See timvink#14
@uliska
Copy link
Contributor Author

uliska commented Mar 3, 2020

I've gone through the review comments. Indeed Pull Requests should not be so heavy, but as said at one point I didn't see any chance for an in-between anymore.

I've marked many comments as "resolved", in most cases after addressing them with a commit. A number of comments are not immediately resolvable and I added comments and questions. One general remark: I've stated disagreement in quite a number of cases. Please don't get me wrong with these - I don't want to appear arrogant, and I don't want to fight over these issues. But it's important to clearly express differing opinions in order to be able to resolve them for the best. So I won't be offended if you disagree with my comments on your turn ...

There are two recurring issues where we either have different ideas about programming style or where you may not fully see my idea (which does not mean my idea is correct or the best solution).

A)
If I have a field

class MyClass:
    def __init__(self):
        self._field = 'XY'

I think the paradigm of encapsulation says it is always better to have a method MyClass.field() to access this than to directly access it through my_class_object._field. Therefore I would strongly suggest not to remove these "property" methods. Except in a few cases where they are not used at all, in which case I have removed them in new commits now.

B)
There are a number of methods that retrieve or create objects, which you seem to find somewhat unclear.
The more I thought about this I'm sure it's the right way to approach the task in the context of this plugin. It is a fairly common pattern for accessing complex and/or expensive objects. Create them upon first request and then store them.
I think it's also straightforward from a semantic perspective. I just take one example, but it's basically the same throughout: When processing the git blame data for a page I encounter a Git SHA for each line. To deal with it I need to fetch the Commit object corresponding to that SHA. So I'm asking for it, and if this SHA has been used already there's an object available, otherwise it will created (and the actual Git commit processed) at the time when I need the object for the first time.
I think this is not less clear than the more explicit way of asking for the object and if it's not there already create it in a separate step.

@timvink
Copy link
Owner

timvink commented Mar 4, 2020

But it's important to clearly express differing opinions in order to be able to resolve them for the best.

I agree, and thanks for all your comments. I really appreciate the conversation !

Indeed some discussions about different programming styles, but educational for me as well.

I resolved most discussions, couple of points where we need to think a bit more. Getting there!

This is necessary to get a hold on the whole element
through CSS, for example to make it invisible (in certain
contexts).
It should be clear to any future contributor that when the
on_page event stage is reached all pages have already been
parsed and the repo-wide statistics are available.
This is to separate concerns: Move the formatting of the
list of authors to the repo not in the Git classes but separately.
@uliska uliska mentioned this pull request Mar 4, 2020
5 tasks
Copy link
Contributor Author

@uliska uliska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, and thanks for all your comments. I really appreciate the conversation !

Indeed some discussions about different programming styles, but educational for me as well.

I have the impression that we both are completely ready to accept the other's arguments, which is a good basis for collaboration.

I will have to leave now and hopefully find more time tonight on the git-blame issue.

Closes timvink#20

While parsing the --porcelain output of git blame is more complicated
than that of git blame -lts

* it should be considered more robust
* it provides substantially more information:
  - we can get all information about the commits that we need
  - this makes it obsolete to call `git show` on the commits

The format of the commit timestamp is different in this command,
therfore this had to modify the datetime processing functions,
which were moved to util.py at the same time.
@timvink timvink merged commit 55a86d8 into timvink:master Mar 4, 2020
@timvink
Copy link
Owner

timvink commented Mar 4, 2020

Merged! 🎉 Amazing work @uliska !

@uliska uliska deleted the preload-blame branch March 4, 2020 22:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants