-
Notifications
You must be signed in to change notification settings - Fork 4
Revise config model #175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revise config model #175
Conversation
… annotations Redefine `_parent_run_directory` and `_run_directory` of `RunConfig` as private properties since they are computed based on the value of a field of the model and are never modified. Additionally, `None` type annotations are removed from these properties for they are really never `None`
… annotation Set the default of `run_id` using the default factory mechanism. This allows a more declarative definition of the model and `model_post_init` should be reserved for setting something that requires multi-field access. Additionally, the `None` type annotation is removed. `run_id` was never `None` once it was set. The `None` value was only used as an indication of absence of user provided value and to signal the setting of a default. Setting the default of `run_id` using the default factory mechanism obviates the `None` type annotation.
…one` type annotation Set the default of `cache_directory` using the default factory mechanism. This allows a more declarative definition of the model and `model_post_init` should be reserved for setting something that requires multi-field access. Additionally, the `None` type annotation is removed. `cache_directory` was never `None` once it was set. The `None` value was only used as an indication of absence of user provided value and to signal the setting of a default. Setting the default of `cache_directory` using the default factory mechanism obviates the `None` type annotation.
…ne` type annotation Set the default of `bids_directory` using the default factory mechanism. This allows a more declarative definition of the model and `model_post_init` should be reserved for setting something that requires multi-field access. Additionally, the `None` type annotation is removed. `bids_directory` was never `None` once it was set. The `None` value was only used as an indication of absence of user provided value and to signal the setting of a default. Setting the default of `bids_directory` using the default factory mechanism obviates the `None` type annotation.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## add_config #175 +/- ##
==============================================
- Coverage 87.56% 87.31% -0.25%
==============================================
Files 27 28 +1
Lines 965 962 -3
==============================================
- Hits 845 840 -5
- Misses 120 122 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| @property | ||
| def _parent_run_directory(self) -> pathlib.Path: | ||
| """The parent directory where all run-specific directories are stored.""" | ||
| return self.cache_directory / "runs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it is only being set once, isn't it cleaner to set it when I originally did? (otherwise, this line gets re-called every time you access the property)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further, marking it as a PrivateAttr with the rest of the fields make it more obvious that is there and has all the other pydantic side effects you would expect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it is only being set once, isn't it cleaner to set it when I originally did? (otherwise, this line gets re-called every time you access the property)
It may be cleaner. However, what I see is that this value along with _run_directory are based on the value of another attribute in the model, and since this RunConfig is mutable (Pydantic models don't have true immutability), defining these two values as private attributes can render them outdated once the containing model object is mutated. Additionally, since these two values are based on the value of another attribute, it seems more appropriate to me that they are read-only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further, marking it as a PrivateAttr with the rest of the fields make it more obvious that is there and has all the other pydantic side effects you would expect
I don't know which side effects you are referring to. The main point having a private attribute in a Pydantic model is that you can keep information in the model that behaves very differently than a regular field, https://docs.pydantic.dev/latest/concepts/models/#private-model-attributes.
…to"` value option Set the default of `file_mode` using the default factory mechanism. This allows a more declarative definition of the model and `model_post_init` should be reserved for setting something that requires multi-field access. Additionally, the `"auto"` value option is removed. `file_mode` was never `"auto"` once it was set. The `"auto"` value was only used as an indication of absence of user provided value and to signal the setting of a default. Setting the default of `file_mode` using the default factory mechanism obviates the `"auto"` value option and allows the type annotation of the `file_mode` field to be consistent with the possible values that it holds in operation.
…dator Doing it this way makes the definition of the model more declarative and also avoids using `model_post_init` which should be preserved for validations that requires access to multiple fields of the model
Doing this allows CLI arguments to remain the same to users while making the `RunConfig` more succinct. For example, `RunConfig.run_id` can now be of type `str` instead of `str | None` even though it doesn't hold a value of `None` in operation.
When validating a directory as a BIDS directory, ensure it is a file instead of just ensuring its mere existence. This rules out the situation where `dataset_description.json` is a directory
…json file when validating a directory as a BIDS directory
| from .._core._run_id import _generate_run_id | ||
|
|
||
|
|
||
| def _validate_existing_directory_as_bids(directory: pathlib.Path) -> pathlib.Path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well move this out like you did the file mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(into a separate file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do that. I don't have a strong preference. Do you want to _core?
However, I want you to know that _validate_existing_directory_as_bids is a field validator in the RunConfig now. The argument it takes, the values it returns, and the errors it raises, are dictated by Pydantic. I.e. It may not have any other use, so it may be a good idea to just keep it close the model that uses it, as it is really part of the model in some sense.
| file_mode : one of "move", "copy", "symlink", or "auto", default: "auto" | ||
| file_mode : one of "move", "copy", or "symlink" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yarik specifically requested the "auto" value previously
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can re-add it if he really wants it, but the it will not be a consistent model.
I preserved "auto" as an option in the CLI, so it is as user friend to the CLI users. Adding it into the config model doesn't make much sense since once an a model object is initialized, the field never hold a value of "auto". This behavior is actually reflected in your code.
nwb2bids/src/nwb2bids/_converters/_session_converter.py
Lines 128 to 133 in de9064b
| if file_mode == "copy": | |
| shutil.copy(src=nwbfile_path, dst=session_file_path) | |
| elif file_mode == "move": | |
| shutil.move(src=nwbfile_path, dst=session_file_path) | |
| elif file_mode == "symlink": | |
| session_file_path.symlink_to(target=nwbfile_path) |
There is no checking for the "auto" value in that example, which is appropriate.
@yarikoptic What do you think? It may not be what you want initially, but I think it works better.
|
Cool, looks like you got it working! Some minor comments above |
It is quite a bit of tweaks, but I think it is worth it since this model permeates through out the codebase. It will help simplify the code or/and avoid quite a bit of mypy errors down the road. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the RunConfig class to use Pydantic's field validators and default factories for better validation and initialization, replacing the previous approach that used None defaults with post-initialization logic.
Key changes:
- Extracted file mode detection and BIDS directory validation into standalone functions
- Converted optional fields with
Nonedefaults to usedefault_factorywith validators - Updated command-line interface to filter out unset parameters before validation
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/nwb2bids/_core/_file_mode.py | New module containing _determine_file_mode() function extracted from RunConfig |
| src/nwb2bids/_converters/_run_config.py | Refactored to use Pydantic validators and default factories; extracted validation logic to standalone function; converted private attributes to properties |
| src/nwb2bids/_command_line_interface/_main.py | Updated to filter CLI parameters before passing to RunConfig.model_validate() |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]>
|
🚀 PR was released in |
The main goal of this PR is to make the
RunConfigmodel introduced in #164 more succinct and declarative.To accomplish this goal, it
Nonetype annotations for thebids_directory,cache_directory, andrun_idfields. Annotating these fields asNoneis inconsistent with their behavior since these fields never hold aNonevalue once the containing model object is initialized. The codebase also expects these values to be non-None as innwb2bids/src/nwb2bids/_converters/_run_config.py
Line 63 in 4028e5d
Noneannotations are not removed, the code above would raise amypyerror sinceNonedoesn't support the/operator."auto"for thefile_modefield . Similar to situation described in the previous point, annotating thefile_modefield with a possible value of"auto"contradicts with the behavior that thefile_modefield is never"auto"once the containing object is initialized. Our codebase already depends on this behavior,nwb2bids/src/nwb2bids/_converters/_session_converter.py
Lines 128 to 133 in de9064b
"auto"value is never checked in that code)._parent_run_directoryand_run_directoryas properties. These two "properties" should be more appropriately defined as properties since that are calculated based on other attributes and should be updated when those other attributes change. They should also be read-only.default_factoryof the respective field instead of invoking them in themodel_post_init()hook method. This help make the model more declarative._validate_existing_directory_as_bidsto be used as a field validator. This help make the model more declarative and allows raising validation errors with proper location.Aside from accomplishing the main goal, this PR incidentally made the following minor improvements in validation of existing directory as a BIDS directory.
dataset_description.jsonexists as a file. Raise proper error otherwise.dataset_description.jsondoesn't contain proper JSON