Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor conda-forge.yml linting logic, hint about extra fields #1900

Closed
wants to merge 13 commits into from

Conversation

ytausch
Copy link
Contributor

@ytausch ytausch commented Apr 2, 2024

Checklist

  • Added a news entry

Supersedes #1865 .

recipe-lint now hints about additional fields as compared with the Pydantic schema. While working on it, I cleaned up the linting logic. A new PR will follow for recipe lints.

Cc @0xbe7a

@ytausch ytausch marked this pull request as ready for review April 2, 2024 14:37
@ytausch ytausch requested a review from a team as a code owner April 2, 2024 14:37
Copy link
Member

@beckermr beckermr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR has API changes. We should avoid them unless we are ready to go around fixing everything so we can bump to smithy v4.

@ytausch
Copy link
Contributor Author

ytausch commented Apr 2, 2024

Which changes do you mean? I think the Python API changes that include lintify_forge_yaml?

I can re-add them with the old type signature and raise a deprecation warning.

Is there a deprecation process for conda-smithy? How do I mark functions for deletion in v4?

@beckermr
Copy link
Member

beckermr commented Apr 2, 2024

I mean any public function in the package. We cannot just delete them at will.

I am not personally in favor of adding a deprecation process to smithy. The old APIs are just fine. However, the discussion of API changes and smithy v4 is something the core group as whole has to agree on.

All this PR needs to do is add hints on extra fields.

If you want to propose changes or cleanups to other functions, please use a separate PR.

@ytausch
Copy link
Contributor Author

ytausch commented Apr 2, 2024

I understand that public functions need to stay intact. This is why I want to re-add them with a deprecation notice.

In my opinion, a deprecation process is needed to keep a modern API. A lot of things have changed since this code was initially written, and a lot of things are historic. Refactoring is part of keeping the code in a maintainable state. Here is a list of the things of the public API I want to change, together with my rationale for it:

  • lint_recipe.str_type is a redundant alias for str, it was added for supporting Python 2 and 3 (get_section: check ABCs instead of list/dict subclass #871) but we don't need it anymore
  • lintify is not an English word
  • returning a tuple of two lists (lints, hints) is error-prone because you can very easily switch them around (in fact, I did it myself while working on it and I also saw some confusion about it in conda-smithy itself) - the data class approach resolves this while adding some elegance for adding lints together

Besides from that:

  • _format_validation_msg is not part of the public API

Also, I think the new Python consensus is that pathlib is superior of the old os.path functions. As far as I see though, making this change only affected new code and test code, so no breaking API changes.

The reason why I included all this in this PR (and not a separate one) is that I don't want to make our technical debt worse by building more stuff upon it. All of my changes directly interfere with the code I am adding, so IMO there are no unrelated changes.

@ytausch
Copy link
Contributor Author

ytausch commented Apr 2, 2024

This PR should now contain no breaking API changes anymore.

@ytausch ytausch requested a review from beckermr April 2, 2024 16:44
@ytausch
Copy link
Contributor Author

ytausch commented Apr 3, 2024

Related to my "fixing things as you see them" idea: https://softwareengineering.stackexchange.com/q/244807

@ytausch ytausch mentioned this pull request Apr 5, 2024
1 task
Comment on lines 60 to 81
def lint_extra_fields(
forge_yaml: dict,
) -> LintsHints:
"""
Identify unexpected keys in the conda-forge.yml file.
This only works if extra="allow" is set in the Pydantic sub-model where the unexpected key is found.
"""

config = ConfigModel.model_validate(forge_yaml)
hints = []

def _find_extra_fields(model: BaseModel, prefix=""):
for extra_field in (model.__pydantic_extra__ or {}).keys():
hints.append(f"Unexpected key {prefix + extra_field}")

for field, value in model:
if isinstance(value, BaseModel):
_find_extra_fields(value, f"{prefix + field}.")

_find_extra_fields(config)

return LintsHints(hints=hints)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an additional clarification: This function is the only functional change that this PR introduces, together with its tests in TestLintExtraFields and the changes in schema.py.

Everything else is refactoring.

@ytausch ytausch force-pushed the hint-missing-platforms branch from de7d64a to fea25db Compare April 5, 2024 17:04
@ytausch
Copy link
Contributor Author

ytausch commented Apr 24, 2024

@xhochy @beckermr Was there any discussion about how these (technically) API changes should be handled? I would like this PR to move forward and am happy to contribute to a discussion if it hasn't happened yet.

xref #1906 (comment)

from typing import List

import jsonschema
from pydantic import BaseModel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a new dependency pydantic on users. Previously it was only a dev dependency. Is it possible to remove this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like and strongly propose to keep pydantic here. Pydantic provides an elegant way to find additional fields that are not part of the schema (see my implementation) - and also, I think the pydantic model should be used in more parts of the smithy code anyway since it provides a type-safe way to access fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly disagree. This PR use pydantic to check for which fields to hint about. If we can't do that using the json schema we are essentially making a distinction between pydantic model and json schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean with "distinction"? The pydantic model should be equivalent to the JSON schema anyway - if it's not, we have a bug. I am not aware of a good way of doing this with JSON schema.

Copy link
Contributor Author

@ytausch ytausch May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isuruf Let's continue this in #1920.

@isuruf
Copy link
Member

isuruf commented Apr 25, 2024

Can we please use one PR to do one thing? The refactoring shouldn't have to get blocked on hint about extra fields.

@ytausch
Copy link
Contributor Author

ytausch commented Apr 26, 2024

Can we please use one PR to do one thing? The refactoring shouldn't have to get blocked on hint about extra fields.

Adding to my comment above: My fear is that refactorings simply don't happen if the policy is as strict as you propose. However, since you seem to really want separate PRs, I will try to separate refactorings and functional changes more into separate PRs since I see a point in being able to review easily. Here, I pointed out the only functional change above.

@xhochy
Copy link
Member

xhochy commented Apr 26, 2024

However, since you seem to really want separate PRs, I will try to separate refactorings and functional changes more into separate PRs since I see a point in being able to review easily.

Just to second this: For me, as someone who only "briefly" knows the codebase, this PR is quite hard to review to the mix of refactoring and functional changes. I would be able to give better feedback if only small things change. You can make larger PRs to illustrate the context, but we should aim to have small ones that are easily reviewable and thus can be merged quite quickly.

@ytausch ytausch force-pushed the hint-missing-platforms branch from 4b7999b to aa7e161 Compare April 26, 2024 11:56
@ytausch
Copy link
Contributor Author

ytausch commented Apr 26, 2024

@isuruf @beckermr Since our discussion so far was very general (which is fine) I would like to know what exactly needs to happen with these changes here so that you are able to review them. Is it sufficient to incorporate the things we discussed in new PRs or is some finer grained separation also necessary here?

@jaimergp
Copy link
Member

Agreed with others that there are many changes (which don't get me wrong, they are welcome!) that get in the way of reviewing this PR with proper attention. I tuned out a couple times after a few os.path.join removals.

I'm in favor of having these QoL improvements done but they should be easy to review (e.g. one type of change per commit), and scoped in their own PR. These refactors are usually easy to review when the PR only changes that and doesn't change behavior.

My advice here is to split this PR into two PRs:

  • The minimal changeset that implements the extra field hints.
  • A general QoL changes PR, possibly supported by tooling (e.g. ruff, pyupgrade, etc). The idea here would be to agree on some config to apply automatically to the codebase and then regenerate the PR as necessary.

Let me know if you need assistance!

@ytausch
Copy link
Contributor Author

ytausch commented May 3, 2024

The functional part of this PR moved to #1920. I will clean this up here, please move any discussion about these parts there.

@ytausch ytausch closed this May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants