-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix column order validation #182
Conversation
WalkthroughThe changes in the pull request focus on the Changes
Assessment against linked issues
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here. PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here. PR Code Suggestions ✨Explore these optional code suggestions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
sdrf_pipelines/sdrf/sdrf_schema.py (1)
Line range hint
295-302
: Improve Variable Naming for Clarity.The variables
temp
anderror
are generic and could be renamed to more descriptive names to enhance code clarity.For example:
- Rename
temp
tofactor_columns
.- Rename
error
tomisplaced_columns
.Apply this diff to update the variable names:
if factor_tag: - temp = [] - error = [] + factor_columns = [] + misplaced_columns = [] for column in cnames[factor_index:]: - if "comment" in column or "characteristics" in column: - error.extend(temp) - temp = [] + if "comment" in column or "characteristics" in column: + misplaced_columns.extend(factor_columns) + factor_columns = [] elif "factor value" in column: - temp.append(column) + factor_columns.append(column) if len(error): - error_message = "The following factor column should be last: {}".format(", ".join(error)) + error_message = "The following factor columns should be last: {}".format(", ".join(misplaced_columns)) error_columns_order.append(LogicError(error_message, error_type=logging.ERROR))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- sdrf_pipelines/sdrf/sdrf_schema.py (1 hunks)
🧰 Additional context used
🔇 Additional comments (4)
sdrf_pipelines/sdrf/sdrf_schema.py (4)
283-283
: Variable Renaming Enhances Clarity: Change Approved.Renaming
index
toassay_index
improves code readability by explicitly indicating that this index corresponds to the "assay name" column.
285-285
: Use ofenumerate
Improves Code Efficiency and Readability.Switching to
enumerate
for iterating over column names simplifies index tracking and enhances code readability.
286-287
: Validation Logic Correctly Enforces "comment" Columns Position.The condition properly checks that "comment" columns do not appear before the "assay name" column, ensuring the column order is validated as per the schema.
291-292
: Inclusion of "technology type" in Order Validation Strengthens Schema Enforcement.Adding "technology type" to the validation logic ensures that this column is not positioned after the "assay name," aligning with the intended column order.
sdrf_pipelines/sdrf/sdrf_schema.py
Outdated
or "technology type" in column) and idx > assay_index: | ||
error_message = "The column " + column + " cannot be after the assay name" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Refactor Conditional Statement for Improved Readability.
The complex conditional can be refactored for better readability by restructuring the conditions.
Consider reorganizing the condition:
if idx > assay_index and (
"characteristics" in column
or ("material type" in column and "factor value" not in column)
or "technology type" in column
):
This places the index check upfront and groups the column-related conditions together, making the logic clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
sdrf_pipelines/sdrf/sdrf_schema.py (1)
289-289
: Use f-strings for constructing error messagesFor improved readability and consistency, consider using f-strings when constructing error messages throughout the validation logic.
Apply these diffs:
# Line 289 - error_message = "The column " + column + " cannot be before the assay name" + error_message = f"The column {column} cannot be before the assay name" # Line 292 (after applying previous suggestion) - error_message = f"The column {column} cannot be before the assay name" + error_message = f"The column {column} cannot be before the assay name" # Line 302 - error_message = "The column " + column + " cannot be after the assay name" + error_message = f"The column {column} cannot be after the assay name" # Line 305 - error_message = "The column " + column + " must be immediately after the assay name" + error_message = f"The column {column} must be immediately after the assay name"Also applies to: 292-292, 302-302, 305-305
User description
Fixes #177.
list.index
calls withenumerate
Raise error when "technology type" is after "assay name" and not beforeImplement the new decision (Major PR to update some inconsistencies in the specification. proteomics-sample-metadata#726) wheretechnology type
should be immediately afterassay name
but it is only a warning if it is immediately before.PR Type
Bug fix
Description
list.index
calls withenumerate
to improve performance in column order validation.Changes walkthrough 📝
sdrf_schema.py
Improve column order validation logic and error messages
sdrf_pipelines/sdrf/sdrf_schema.py
list.index
calls withenumerate
for efficiency.name".
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Bug Fixes