
feat: add shared LOC bounds configuration and validation#28

Merged
vitali87 merged 2 commits into main from feat/loc-bounds-foundation
Mar 24, 2026
Conversation

@vitali87
Owner

Summary

  • add shared min/max LOC bounds configuration and persisted plan metadata
  • add structured LOC bound violation detection plus strict split-time enforcement
  • expose underflow metrics and document the new CLI/env options

Part of #6.

Testing

  • uv run ruff check pr_split tests
  • uv run pytest -q

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the pr-split tool's ability to manage the size of generated sub-PRs. It introduces granular control over lines of code (LOC) by allowing users to specify both minimum and maximum bounds. Furthermore, it provides a strict enforcement mechanism for these bounds and improves the reporting of plan metrics to include underflow, offering a more comprehensive understanding of the split plan's adherence to desired size constraints.

Highlights

  • Lines of Code (LOC) Bounds Configuration: Introduced new CLI options (--min-loc, --strict-loc-bounds) and environment variables (PR_SPLIT_MIN_LOC, PR_SPLIT_STRICT_LOC_BOUNDS) to allow users to define both minimum and maximum LOC bounds for sub-PRs, enhancing control over PR size.
  • Strict LOC Bound Enforcement: Implemented a strict mode (--strict-loc-bounds) that causes the pr-split tool to exit if the final plan violates the configured LOC bounds, ensuring adherence to defined limits.
  • Structured LOC Violation Detection and Metrics: Added structured detection for LOC bound violations, distinguishing between groups below the minimum and above the maximum. New 'underflow' metrics are now exposed in the plan summary, providing better insights into PR sizing.
  • Persisted Plan Metadata: The new min_loc and strict_loc_bounds settings are now persisted as part of the split plan metadata, ensuring consistency across runs and better traceability.
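To make the "underflow" idea above concrete, here is a hypothetical sketch (not the PR's actual scoring code): it assumes loc_underflow is the total LOC shortfall below min_loc and undersized_groups is the count of groups under that bound. The GroupStats type and underflow_metrics name are illustrative inventions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class GroupStats:
    """Hypothetical stand-in for a planned sub-PR group."""
    name: str
    estimated_loc: int

def underflow_metrics(groups: List[GroupStats], min_loc: Optional[int]) -> Tuple[int, int]:
    """Return (loc_underflow, undersized_groups) under the assumed definitions:
    total LOC shortfall below min_loc, and the number of groups under it."""
    if min_loc is None:
        # No lower bound configured: nothing can underflow.
        return 0, 0
    shortfalls = [min_loc - g.estimated_loc for g in groups if g.estimated_loc < min_loc]
    return sum(shortfalls), len(shortfalls)

groups = [GroupStats("core", 120), GroupStats("docs", 15), GroupStats("tests", 40)]
total_underflow, undersized = underflow_metrics(groups, min_loc=50)
# docs is 35 LOC short and tests is 10 LOC short, so (45, 2)
```

Folding both numbers into the plan summary gives reviewers a quick read on whether sub-PRs are too small, mirroring the existing overflow treatment.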
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its suggestions are incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@greptile-apps

greptile-apps bot commented Mar 24, 2026

Greptile Summary

This PR adds shared min_loc/max_loc/strict_loc_bounds configuration to the Settings model, persists those bounds in the saved SplitPlan, and introduces structured LOC-bound violation detection with a new LocBoundViolation named-tuple. Violations are now surfaced as either non-fatal warnings (default) or hard exits (--strict-loc-bounds).

Key changes:

  • Settings / config.py: adds min_loc (Optional[int], ge=1), strict_loc_bounds, and a validate_loc_bounds model validator that enforces min_loc < max_loc; all ValueError-raising validators now correctly surface as pydantic.ValidationError and are caught accordingly in both cli.py and tests/test_config.py.
  • validator.py: splits validate_loc_bounds into detect_loc_bound_violations (returns structured LocBoundViolation objects) + formatting helpers, adding BELOW_MIN detection alongside the existing ABOVE_MAX path.
  • scoring.py: adds loc_underflow and undersized_groups to PlanMetrics and folds them into the objective function. loc_underflow is logged in PLAN_METRICS; undersized_groups is computed but not yet surfaced in the log message.
  • cli.py: new _handle_loc_bound_warnings helper centralises warn-vs-exit logic; Settings construction is wrapped in try/(ValidationError, ValueError) for a clean CLI error message.
  • Tests: comprehensive coverage added for all new paths, including env-var wiring, strict-mode exit, post-edit re-validation, and plan serialisation round-trip.
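The structured detection described above can be sketched roughly as follows. The LocBoundViolation, BELOW_MIN, ABOVE_MAX, and detect_loc_bound_violations names come from this summary, but the field names, input shape, and signature here are assumptions, not the PR's actual code:

```python
from enum import Enum
from typing import List, NamedTuple, Optional, Tuple

class BoundType(str, Enum):
    BELOW_MIN = "below_min"
    ABOVE_MAX = "above_max"

class LocBoundViolation(NamedTuple):
    """Structured violation record; string formatting is a separate concern."""
    group_name: str
    estimated_loc: int
    bound_type: BoundType
    bound_value: int

def detect_loc_bound_violations(
    groups: List[Tuple[str, int]],
    max_loc: int,
    min_loc: Optional[int] = None,
) -> List[LocBoundViolation]:
    """Return structured violations; rendering them as warnings (or exiting
    in strict mode) is left to separate helpers."""
    violations = []
    for name, loc in groups:
        if min_loc is not None and loc < min_loc:
            violations.append(LocBoundViolation(name, loc, BoundType.BELOW_MIN, min_loc))
        elif loc > max_loc:  # elif keeps the two bounds mutually exclusive per group
            violations.append(LocBoundViolation(name, loc, BoundType.ABOVE_MAX, max_loc))
    return violations

found = detect_loc_bound_violations([("docs", 10), ("core", 900)], max_loc=500, min_loc=50)
```

Returning structured tuples rather than preformatted strings is what lets the CLI choose between logging warnings and a strict-mode exit from the same detection pass.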

Confidence Score: 4/5

  • PR is safe to merge; the one logic-level gap (undersized_groups never logged) is a missing observability feature, not a correctness defect.
  • All previously flagged issues (duplicate warning output, ValueError vs ValidationError, help-text inconsistency) have been resolved. The ValidationError import and catch are correct. _handle_loc_bound_warnings behaves correctly in both branches and the typer.Exit it raises correctly escapes the surrounding except PRSplitError handler. One minor gap: undersized_groups is computed in score_plan and returned in PlanMetrics but is never logged or otherwise surfaced, unlike loc_underflow which was correctly wired into the log template. The test mock in test_client.py is also missing this field, which will become a latent AttributeError once the log call is updated.
  • In pr_split/planner/scoring.py and pr_split/planner/client.py, the undersized_groups metric needs to be added to the PLAN_METRICS log format to match the treatment of loc_underflow.

Important Files Changed

Filename Overview
pr_split/cli.py Adds --min-loc, --max-loc re-wiring and --strict-loc-bounds CLI options; properly catches ValidationError from Settings construction; introduces _handle_loc_bound_warnings helper used consistently in both initial and post-edit validation paths.
pr_split/config.py Adds min_loc, strict_loc_bounds fields and validate_loc_bounds model validator; correctly uses Field(ge=1) on the Optional[int] min_loc field.
pr_split/planner/scoring.py Introduces loc_underflow and undersized_groups metrics. loc_underflow is correctly added to the log format; undersized_groups is computed but never logged or surfaced anywhere — likely an oversight.
pr_split/planner/validator.py Refactors to separate structured LocBoundViolation detection from string formatting; adds BELOW_MIN violation type alongside the existing ABOVE_MAX path.
pr_split/schemas.py Persists min_loc and strict_loc_bounds into SplitPlan; backward-compatible defaults (None and False) preserve existing serialised plans.
tests/test_client.py Mock for score_plan was updated to add loc_underflow but is missing the new undersized_groups attribute; will cause AttributeError once that field is wired into the log call.
tests/test_config.py All existing tests updated to pytest.raises(ValidationError, ...) correctly; new TestSettingsLocBoundsValidation class covers the new min/max validation logic comprehensively.
tests/test_cli_new_features.py Adds well-structured tests for _handle_loc_bound_warnings and the new CLI env-var/strict-mode paths; correctly asserts save_plan is not called when strict mode exits early.

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["CLI: split()"] --> B["Build Settings\n(min_loc, max_loc, strict_loc_bounds)"]
    B -->|"ValidationError"| C["Print error & Exit(1)"]
    B --> D["plan_split(parsed_diff, settings)"]
    D --> E["score_plan(groups, max_loc, min_loc)\n→ PlanMetrics incl. loc_underflow, undersized_groups"]
    E --> F["validate_plan(groups, …, max_loc, min_loc)"]
    F --> G["detect_loc_bound_violations()\n→ list[LocBoundViolation]"]
    G --> H["validate_loc_bounds()\n→ list[str warnings]"]
    H --> I["_handle_loc_bound_warnings(warnings, strict_loc_bounds)"]
    I -->|"strict=True & warnings"| J["Print violations & Exit(1)"]
    I -->|"strict=False or no warnings"| K["logger.warning each\n(non-fatal)"]
    K --> L["_interactive_edit()"]
    L --> M["Re-validate (same path F→I)"]
    M -->|"Exit(1)"| N["Abort — save_plan NOT called"]
    M -->|"OK"| O["SplitPlan(min_loc, max_loc, strict_loc_bounds)\nsave_plan / create PRs"]
```
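The warn-vs-exit branch in the flowchart can be sketched as below. This is a standalone approximation: the real helper is _handle_loc_bound_warnings and raises typer.Exit; this stdlib-only sketch raises SystemExit instead, and the console printing is simplified to print().

```python
import logging
from typing import List

logger = logging.getLogger("pr_split.sketch")

def handle_loc_bound_warnings(warnings: List[str], strict_loc_bounds: bool) -> None:
    """Central warn-vs-exit policy for LOC bound violations.

    Strict mode: surface every violation and abort with exit code 1, so
    save_plan is never reached. Non-strict: log each warning and continue.
    """
    if not warnings:
        return
    if strict_loc_bounds:
        for message in warnings:
            print(message)  # the real helper prints via the CLI's console
        raise SystemExit(1)  # stand-in for typer.Exit(1)
    for message in warnings:
        logger.warning(message)
```

Centralising this in one helper is what lets the initial validation and the post-edit re-validation paths share identical behaviour, as the flowchart's F→I loop shows.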

Comments Outside Diff (3)

  1. pr_split/planner/scoring.py, line 52-57 (link)

    undersized_groups computed but never logged

    undersized_groups is a new field added to PlanMetrics alongside loc_underflow, but only loc_underflow was wired into the PLAN_METRICS log template (logs.py) and the logger.info call in client.py. undersized_groups is computed here and returned in the dataclass but is never surfaced to the user anywhere — not in logs, not in console output. This is inconsistent with how loc_underflow was introduced (it was both added to the dataclass and to the log message).

    To surface it, update logs.PLAN_METRICS and client.py's logger call:

```python
# logs.py
PLAN_METRICS = (
    "Plan metrics: groups={groups}, max_group_loc={max_loc}, underflow={underflow}, "
    "undersized={undersized}, overflow={overflow}, width={width}, depth={depth}, "
    "scatter={scatter}, objective={objective}"
)
```

```python
# client.py
logger.info(
    logs.PLAN_METRICS.format(
        ...
        underflow=metrics.loc_underflow,
        undersized=metrics.undersized_groups,
        overflow=metrics.loc_overflow,
        ...
    )
)
```
  2. tests/test_client.py, line 237-250 (link)

    Mock missing undersized_groups attribute

    The score_plan mock return value was updated to add loc_underflow, but is missing the equally new undersized_groups field that was added to PlanMetrics in this PR. Currently this doesn't cause a test failure because client.py's logger call doesn't access metrics.undersized_groups, but once undersized_groups is wired into the log call (see the scoring.py comment), the test will raise AttributeError at runtime.

  3. pr_split/planner/validator.py, line 68-97 (link)

    A group can generate two violations simultaneously

    detect_loc_bound_violations appends a BELOW_MIN violation when estimated_loc < min_loc, and then separately appends an ABOVE_MAX violation when estimated_loc > max_loc. These two conditions are mutually exclusive only because the Settings validator enforces min_loc < max_loc. If that constraint ever changes or if a caller bypasses Settings, a single group could appear in the returned list twice — once for each bound — producing duplicate warning messages.

    An explicit elif would make the mutual exclusivity self-documenting and defensive:

```python
for group in groups:
    if min_loc is not None and group.estimated_loc < min_loc:
        violations.append(...)
    elif group.estimated_loc > max_loc:
        violations.append(...)
```
Suggested fix for tests/test_client.py (lines 237-250), adding the missing undersized_groups field to the score_plan mock:
```suggestion
        mock_score.return_value = type(
            "Metrics",
            (),
            {
                "total_groups": 1,
                "max_group_loc": 1,
                "loc_underflow": 0,
                "loc_overflow": 0,
                "dag_width": 1,
                "dag_depth": 1,
                "file_scatter": 0,
                "undersized_groups": 0,
                "objective": 1,
            },
        )()
```


Reviews (4). Last reviewed commit: "fix: align loc bounds CLI with validatio..."

@gemini-code-assist

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

@vitali87 vitali87 force-pushed the feat/loc-bounds-foundation branch from b55b07d to 8591f42 Compare March 24, 2026 10:58
@vitali87
Owner Author

@greptileai review

@vitali87 vitali87 merged commit 4c6c727 into main Mar 24, 2026
4 checks passed