fix: implement robust PR URL validation and sanitization #262

Open
ojaswa072 wants to merge 9 commits into OWASP-BLT:main from ojaswa072:fix/pr-url-input-validation

ojaswa072 (Contributor) commented Feb 27, 2026

Summary

This PR implements comprehensive input validation for GitHub Pull Request URLs to improve security (mitigating SSRF risks) and enhance the user experience.

Key Changes:

  • Frontend Validation: Added a regex check in public/index.html to provide immediate feedback to users, preventing invalid network requests.
  • Backend Hardening: Updated src/utils.py with an anchored regex and length constraints to block malformed or malicious URLs.
  • Graceful Error Handling: Modified src/handlers.py to catch ValueError exceptions and return a 400 Bad Request instead of allowing a server-side crash.
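
The backend behavior described in these bullets can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the body of `parse_pr_url`, the `MAX_URL_LENGTH` value, and the `(status, body)` return shape of `handle_add_pr` are all hypothetical stand-ins.

```python
import json
import re

# Hypothetical cap; the PR mentions "length constraints" without giving a value.
MAX_URL_LENGTH = 200

# Anchored pattern: the whole string must be a GitHub PR URL (assumed form).
_GITHUB_PR_RE = re.compile(
    r"^https://github\.com/([^/?#\s]+)/([^/?#\s]+)/pull/(\d+)$"
)


def parse_pr_url(pr_url: str) -> dict:
    """Return owner, repo and PR number, or raise ValueError for bad input."""
    pr_url = pr_url.strip()
    if not pr_url or len(pr_url) > MAX_URL_LENGTH:
        raise ValueError("URL is empty or too long")
    match = _GITHUB_PR_RE.match(pr_url)
    if match is None:
        raise ValueError("Not a GitHub pull request URL")
    owner, repo, number = match.groups()
    return {"owner": owner, "repo": repo, "pr_number": int(number)}


def handle_add_pr(body: dict) -> tuple[int, str]:
    """Map validation failures to a 400 instead of letting them propagate."""
    try:
        parsed = parse_pr_url(str(body.get("pr_url", "")))
    except ValueError as exc:
        return 400, json.dumps({"error": str(exc)})
    return 200, json.dumps({"success": True, "data": parsed})
```

Because the pattern is anchored to the `github.com` host, a request for an internal address never reaches any outbound fetch, which is the SSRF concern the summary refers to.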

Testing & Verification:

  • Environment: Tested on Windows (manual D1 migration execution).
  • Validation: Confirmed that non-GitHub URLs and non-PR GitHub links (issues/commits) trigger the appropriate error messages.
  • Bypass Check: Verified via curl.exe that the API correctly rejects direct malformed POST requests.

Before: [screenshot]

After: [screenshot]

Summary by CodeRabbit

  • New Features

    • Canonical GitHub URL parsing/normalization for PR/repo/org inputs; ?pr= now prefills with canonical PR URL and editing clears URL errors.
    • Dedicated (hidden) URL error element for clearer, accessible validation feedback.
  • Bug Fixes

    • Enforces composite PR identity to prevent duplicates and standardizes stored canonical URLs.
    • Rejects incompatible bulk vs single-PR submissions with clearer handling.
  • Tests

    • Added URL-validation test suite, integrated into main test command; improved reporting and skip handling.

@github-actions

🍃 PR Readiness Check

Check the readiness of this PR on Leaf:
👉 Open on Leaf

Leaf reviews pull requests for operational readiness, security risks, and production-impacting changes before they ship.

@github-actions github-actions bot added the files-changed: 3 PR changes 3 files label Feb 27, 2026
@owasp-blt owasp-blt bot added the needs-peer-review PR needs peer review label Mar 24, 2026

owasp-blt bot commented Mar 24, 2026

👋 Hi @ojaswa072!

This pull request needs a peer review before it can be merged. Please request a review from a team member who is not:

  • The PR author
  • coderabbitai
  • copilot

Once a valid peer review is submitted, this check will pass automatically. Thank you!

⚠️ Peer review enforcement is active.


coderabbitai bot commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds canonical GitHub URL parsing/validation in frontend and server, sends canonical PR URLs from the UI, migrates DB to composite identity (repo_owner, repo_name, pr_number) with canonical_url persistence and deduplication, updates upsert/lookup logic, and adds a URL-validation test script and migration.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Frontend (page script & markup): public/index.html | Adds hidden #url-error UI, setUrlError(message), parseGitHubTrackingUrl(rawUrl), input listener to clear errors; refactors addPr() to use parser, send pr_url: canonicalUrl and add_all: effectiveAddAll, and identify added PRs by identity via findPrByIdentity. |
| Client tests & npm scripts: test-url-validation.js, package.json | Adds test-url-validation.js that extracts and tests parseGitHubTrackingUrl in a Node vm; updates package.json test to run data-display then url-validation and adds test:url-validation. |
| Server (handlers & utils): src/handlers.py, src/utils.py | Introduces compiled, anchored GH regexes; tightens parse_pr_url/parse_repo_url/parse_org_url (trim only, return canonical_url, enforce types); handle_add_pr trims input, prefers PR parsing first, derives add_all from URL type, upserts using canonical fields, and switches lookups/updates to composite-key (repo_owner, repo_name, pr_number). |
| Database layer & migration: src/database.py, migrations/0004_add_canonical_url_and_composite_uniqueness.sql | upsert_pr computes and persists canonical_url, uses ON CONFLICT(repo_owner, repo_name, pr_number) and updates pr_url/canonical_url from excluded; migration adds canonical_url, backfills values, deduplicates rows by composite identity, and creates a unique composite index. |
| Runtime tests & Windows handling: test-data-display.js | Expands test result statuses to support skipped, adds Windows bash preflight check and cmd.exe spawn path, introduces robust child-process termination, treats specific preflight failures as skipped tests, and updates final summary accounting. |
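
The composite-key upsert summarized for src/database.py might look something like the sketch below. It uses an in-memory SQLite database; the table layout and the simplified `upsert_pr` signature are assumptions for illustration, and only the `ON CONFLICT(repo_owner, repo_name, pr_number)` target and the `excluded` updates come from the walkthrough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the real prs table has more columns.
conn.execute(
    """
    CREATE TABLE prs (
        id INTEGER PRIMARY KEY,
        repo_owner TEXT NOT NULL,
        repo_name TEXT NOT NULL,
        pr_number INTEGER NOT NULL,
        pr_url TEXT NOT NULL,
        canonical_url TEXT
    )
    """
)
conn.execute(
    "CREATE UNIQUE INDEX idx_prs_identity_unique "
    "ON prs (repo_owner, repo_name, pr_number)"
)


def upsert_pr(owner: str, repo: str, number: int, pr_url: str) -> None:
    """Insert a PR, or refresh its URLs if the identity already exists."""
    canonical_url = f"https://github.com/{owner}/{repo}/pull/{number}"
    conn.execute(
        """
        INSERT INTO prs (repo_owner, repo_name, pr_number, pr_url, canonical_url)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT(repo_owner, repo_name, pr_number) DO UPDATE SET
            pr_url = excluded.pr_url,
            canonical_url = excluded.canonical_url
        """,
        (owner, repo, number, pr_url, canonical_url),
    )


# Two spellings of the same PR collapse into a single row.
upsert_pr("OWASP-BLT", "BLT-Leaf", 262, "https://github.com/OWASP-BLT/BLT-Leaf/pull/262/files")
upsert_pr("OWASP-BLT", "BLT-Leaf", 262, "https://github.com/OWASP-BLT/BLT-Leaf/pull/262")
rows = conn.execute("SELECT pr_url, canonical_url FROM prs").fetchall()
```

Keying on the parsed identity rather than the raw URL string is what lets `/pull/262/files` and `/pull/262` resolve to the same record.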

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Browser
    participant FrontendJS as Frontend JS
    participant Backend as /add_pr Handler
    participant ServerUtils as Server URL Parser
    participant DB as Database

    User->>Browser: enter URL, click Add
    Browser->>FrontendJS: submit URL
    FrontendJS->>FrontendJS: parseGitHubTrackingUrl(rawUrl)
    alt client validation fails
        FrontendJS-->>Browser: setUrlError(message)
    else client validation succeeds
        FrontendJS->>Backend: POST /add_pr { pr_url: canonicalUrl, add_all }
        Backend->>ServerUtils: parse_pr_url / parse_repo_url as needed
        alt server validation fails
            Backend-->>FrontendJS: error response
        else server validation succeeds
            Backend->>DB: upsert_pr(repo_owner, repo_name, pr_number, canonical_url, pr_url, ...)
            DB-->>Backend: upsert result
            Backend-->>FrontendJS: success { pr_url: canonicalUrl, details }
            FrontendJS-->>Browser: highlight newly added PR (by identity)
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

quality: medium

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 44.44%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main objective of the PR: implementing robust validation and sanitization for GitHub PR URLs across both frontend and backend. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
src/utils.py (1)

16-22: Remove instructional migration comments from production code.

The numbered “place this regex / replace old function” comments read like patch notes and can be dropped to keep source focused on behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils.py` around lines 16 - 22, Remove the instructional migration
comments and keep only the actual declarations: delete the two numbered comment
lines that say "1. Place this regex..." and "2. Replace the old parse_pr_url..."
so the module contains just the compiled regex _GITHUB_PR_RE = re.compile(...)
at top-level and the parse_pr_url(pr_url: str) function definition below it;
ensure no leftover patch-note style text remains in src/utils.py and run a quick
linter/formatter to confirm only functional comments (if any) remain.
src/handlers.py (1)

246-247: Use one consistent error response shape in handle_add_pr.

This branch returns {"ok": false, ...} while other branches in the same handler return {"error": ...} or {"success": ...}. A single envelope avoids extra client-side branching.

♻️ Suggested consistency fix
-                    json.dumps({'ok': False, 'error': str(e)}),
+                    json.dumps({'success': False, 'error': str(e)}),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers.py` around lines 246 - 247, In handle_add_pr, make the error
response shape consistent with the other branches by replacing the envelope
{"ok": False, "error": str(e)} with the same shape used elsewhere (e.g.,
{"error": str(e)}); update the json.dumps call that constructs the response and
ensure the status/headers remain unchanged so clients always receive the same
error envelope from handle_add_pr.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@public/index.html`:
- Around line 653-660: The current validation uses PR_PATH_RE and rejects any
url.pathname that doesn't match /owner/repo/pull/NUMBER, which incorrectly
blocks repository/org import URLs used by the bulk-import checkbox flow; update
the validation so it accepts the other valid patterns (e.g. /owner/repo and
/orgs/owner or /owner) instead of returning invalid for all non-pull paths:
either add additional regexes (e.g. REPO_PATH_RE and ORG_PATH_RE) and test
url.pathname against them before returning the invalid object, or broaden
PR_PATH_RE to recognize the repo/org forms, and keep the existing return {
valid: false, reason: ... } only when none of the allowed patterns match so
repo/org import flows are not rejected.
- Around line 672-686: Move the GitHub URL validation into the existing addPr()
flow (instead of adding a separate click handler) so the early return actually
prevents submission from all entry points including the Enter key; inside
addPr() read prUrl and addAll state, compute effectiveAddAll (consider org/repo
URLs as "addAll"), and if not effectiveAddAll call parseGitHubPRUrl(prUrl) and
on invalid use setUrlError(result.reason) (or set the 'url-error' element) and
input.focus() then return; clear the error before continuing and remove the
extra document.getElementById('addPrBtn').addEventListener('click', ...) handler
and any direct checkPR(raw) calls so submission only proceeds from addPr() after
validation.
- Line 571: The span with id "url-error" is hard-coded to display: none so the
new handler that only updates textContent never makes errors visible; change the
UI logic to not permanently hide that element and instead toggle its visibility
when setting/clearing the error (e.g., remove the inline display: none, or set
element.style.display = ''/'block' or toggle a CSS class from the URL validation
handler that sets textContent). Ensure the code that clears the message also
hides the element again so the span only appears when the validation handler
populates url-error.

In `@src/utils.py`:
- Around line 17-19: The regex _GITHUB_PR_RE currently requires the optional
suffix to start with '/' so query or fragment suffixes like '?diff=split' or
'#discussion...' are rejected; update the pattern for _GITHUB_PR_RE (and the
other similar regex at lines referenced) to allow optional suffixes that begin
with '/', '?' or '#' (e.g., replace the '(/.*)?$' portion with a non-capturing
alternative that accepts '/', '?' or '#' such as '(?:[/?#].*)?$'), keeping the
existing capture groups for owner, repo, and PR number unchanged.
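
A minimal sketch of the suffix change suggested for `_GITHUB_PR_RE`. The owner and repo character classes here are placeholder assumptions, and `pr_identity` is a hypothetical helper; only the `(?:[/?#].*)?$` tail is taken from the comment.

```python
import re

# Owner/repo classes are illustrative; the suffix group is the suggested change.
_GITHUB_PR_RE = re.compile(
    r"^https://github\.com/([\w.-]+)/([\w.-]+)/pull/(\d+)(?:[/?#].*)?$"
)


def pr_identity(url: str):
    """Return (owner, repo, number) for a PR URL, or None if it does not match."""
    match = _GITHUB_PR_RE.match(url.strip())
    if match is None:
        return None
    return match.group(1), match.group(2), int(match.group(3))


# Path, query and fragment suffixes all resolve to the same PR identity.
with_path = pr_identity("https://github.com/owner/repo/pull/123/files")
with_query = pr_identity("https://github.com/owner/repo/pull/123?diff=split")
with_fragment = pr_identity("https://github.com/owner/repo/pull/123#discussion_r1")
# Trailing junk directly after the number is still rejected.
junk = pr_identity("https://github.com/owner/repo/pull/123abc")
```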

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: OWASP-BLT/coderabbit/.coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: fdb1b398-e936-4b7b-9a65-12abf738853c

📥 Commits

Reviewing files that changed from the base of the PR and between 5fba838 and e8c0de1.

📒 Files selected for processing (3)
  • public/index.html
  • src/handlers.py
  • src/utils.py

S3DFX-CYBER (Contributor) commented

@coderabbitai hasn't this already been done in earlier PRs?

coderabbitai bot commented Mar 25, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for previously merged/closed PRs that touched URL validation or parse_pr_url
gh pr list --repo OWASP-BLT/BLT-Leaf --state all --limit 50 --json number,title,state,mergedAt \
  | jq '.[] | select(.title | test("url|validat|sanitiz|parse"; "i")) | {number, title, state, mergedAt}'

Length of output: 636


🏁 Script executed:

#!/bin/bash
# Get details of PR #279 to see if it overlaps with #262
gh pr view 279 --repo OWASP-BLT/BLT-Leaf --json number,title,body,files

# Also check git log for src/utils.py
git log --oneline src/utils.py | head -20

# Check git log for src/handlers.py
git log --oneline src/handlers.py | head -20

Length of output: 3682


@S3DFX-CYBER — There is related (but distinct) prior work. Here's a breakdown:

| PR | Status | What it validates |
|---|---|---|
| #279 (merged 2026-03-10, same author) | ✅ Merged | Validates the PR ID extracted from API path segments (e.g. /api/prs/123/timeline) in three sub-resource handlers; guards against IndexError and non-numeric IDs. Only touches src/handlers.py. |
| #262 (current, open) | 🔓 Open | Validates the full GitHub PR URL string entered by the user (e.g. https://github.com/owner/repo/pull/123); adds a frontend regex, hardens parse_pr_url in src/utils.py, and converts ValueError → HTTP 400. |

The git history for src/utils.py confirms that no prior PR added input validation to parse_pr_url — the first validation commit on that function is the one introduced by this PR (c2243c9).

So these two PRs are complementary, not duplicate: #279 handles internal path safety, while #262 handles the upstream user-facing URL input layer. This PR does introduce new work that wasn't covered before.

S3DFX-CYBER (Contributor) left a comment

Fix the CodeRabbit comments.

@github-actions github-actions bot added files-changed: 5 PR changes 5 files and removed files-changed: 3 PR changes 3 files labels Mar 25, 2026

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
public/index.html (1)

672-686: ⚠️ Potential issue | 🔴 Critical

Remove redundant click handler that calls undefined checkPR function.

The button has duplicate click handlers: one at lines 673–686 (reviewed snippet) and another at line 3518. The first handler calls undefined checkPR(raw) on line 685, which will throw a ReferenceError at runtime. Additionally, parseGitHubPRUrl is only used within this redundant handler and is not called elsewhere. The correct handler at line 3518 uses addPr() which properly validates via parseGitHubTrackingUrl and handles all submission logic. Remove lines 673–686 entirely to fix the duplicate execution and undefined function call.

🐛 Proposed fix: remove redundant click handler
-// Attach to button
-document.getElementById('addPrBtn').addEventListener('click', () => {
-  const raw = document.getElementById('prUrlInput').value;
-  const errEl = document.getElementById('url-error');
-  errEl.textContent = '';
-
-  const result = parseGitHubPRUrl(raw);
-  if (!result.valid) {
-    errEl.textContent = result.reason;   // inline error, no alert()
-    document.getElementById('prUrlInput').focus();
-    return;
-  }
-
-  checkPR(raw);  // passes only after validation
-});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@public/index.html` around lines 672 - 686, The snippet registers a redundant
click handler on addPrBtn that uses parseGitHubPRUrl and calls an undefined
checkPR(raw), causing a runtime ReferenceError and duplicate behavior; delete
the entire handler that references parseGitHubPRUrl, checkPR, prUrlInput and
url-error (the block attaching the click listener to getElementById('addPrBtn')
shown in the diff) so only the correct handler that calls addPr() remains (the
handler around addPrBtn at line ~3518), ensuring validation and submission flow
use parseGitHubTrackingUrl/addPr.
🧹 Nitpick comments (1)
public/index.html (1)

641-670: parseGitHubPRUrl is now redundant and causes duplicate definitions.

This function and its local PR_PATH_RE constant (line 654) appear to be superseded by parseGitHubTrackingUrl at lines 1346-1431. Having two validation functions with different regex patterns creates:

  1. Inconsistent validation (line 654 allows /pull/123/files, line 1323 does not; line 1323 is case-insensitive, line 654 is not)
  2. Test extraction failures (test finds the wrong const PR_PATH_RE = occurrence)
  3. Dead code since the caller checkPR doesn't exist

Consider removing parseGitHubPRUrl entirely since addPr() now uses parseGitHubTrackingUrl.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@public/index.html` around lines 641 - 670, Remove the redundant
parseGitHubPRUrl function and its local PR_PATH_RE constant since addPr() and
other logic now use parseGitHubTrackingUrl; delete the entire parseGitHubPRUrl
block (including the PR_PATH_RE regex) and remove any leftover references (e.g.,
the defunct checkPR) so there are no duplicate URL-parsing definitions or
conflicting regexes, and ensure tests/reference points target
parseGitHubTrackingUrl instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/handlers.py`:
- Around line 270-278: You are writing canonical_url into the DB but still using
pr_url as the unique lookup, which can create duplicate rows; instead backfill
existing rows or change dedupe to the stable composite key and keep
canonical_url as metadata. Update upsert_pr to perform uniqueness/upsert by
(owner/repo/pr_number) rather than pr_url, add a migration/backfill that
populates canonical_url for existing rows, and change lookups like the
db.prepare('SELECT * FROM prs WHERE pr_url = ?') call and any other exact pr_url
queries to use the composite key (parsed['owner'], parsed['repo'],
parsed['pr_number']) or a normalized URL function; leave canonical_pr_url stored
but do not rely on it as the primary dedupe key until after migration.

In `@src/utils.py`:
- Around line 14-25: The owner/repo capture classes in _GITHUB_PR_RE and
_GITHUB_REPO_RE are too permissive and can include query/hash/space chars;
update those two patterns to use [^/?#\s]+ instead of [^/]+ so owner and repo
segments stop matching ?, # or whitespace. Do not change _GITHUB_ORG_RE to the
broad [^/?#\s]+—leave its restricted class (or explicitly keep/replace with the
safe [A-Za-z0-9_.-]+ form) so organization names remain limited to valid
characters.
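
The effect of tightening the owner/repo classes can be seen in a small comparison. Both patterns below are illustrative reconstructions rather than the exact contents of src/utils.py; the names `LOOSE_REPO_RE` and `STRICT_REPO_RE` are made up for the example.

```python
import re

# Permissive form described in the comment: [^/]+ also swallows '?', '#' and spaces.
LOOSE_REPO_RE = re.compile(r"^https://github\.com/([^/]+)/([^/]+)/?$")

# Tightened form: each segment stops at '/', '?', '#' or whitespace.
STRICT_REPO_RE = re.compile(r"^https://github\.com/([^/?#\s]+)/([^/?#\s]+)/?$")

url = "https://github.com/OWASP-BLT/BLT-Leaf?tab=readme"

loose = LOOSE_REPO_RE.match(url)    # matches, but the query leaks into the repo name
strict = STRICT_REPO_RE.match(url)  # no match, so the query cannot become part of the repo
clean = STRICT_REPO_RE.match("https://github.com/OWASP-BLT/BLT-Leaf")
```

With the loose pattern the captured repo name would be `BLT-Leaf?tab=readme`, which would then flow into any canonical URL or database key built from it.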

In `@test-url-validation.js`:
- Around line 26-30: The extractBlock call is matching the first occurrence of
"const PR_PATH_RE =" (inside parseGitHubPRUrl) and extracting partial function
bodies which lead to top-level return errors; update the extraction to use a
unique, module-scope start marker (for example the declaration that includes the
/i flag) so extractBlock(html, '<unique module-scope marker for PR_PATH_RE with
/i>', 'function showSectionMessage(') targets the module-level PR_PATH_RE
instead of the function-local one; locate and modify the extractBlock invocation
and its start marker in test-url-validation.js (symbols: extractBlock,
PR_PATH_RE, parseGitHubPRUrl, showSectionMessage) so the extraction begins at
the module-scope regex declaration.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: OWASP-BLT/coderabbit/.coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 4d58b63e-67b5-4be5-b19a-bee2e76301bc

📥 Commits

Reviewing files that changed from the base of the PR and between e8c0de1 and b23fc17.

📒 Files selected for processing (5)
  • package.json
  • public/index.html
  • src/handlers.py
  • src/utils.py
  • test-url-validation.js
✅ Files skipped from review due to trivial changes (1)
  • package.json

@github-actions github-actions bot added files-changed: 8 PR changes 8 files and removed files-changed: 5 PR changes 5 files labels Mar 25, 2026

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@migrations/0004_add_canonical_url_and_composite_uniqueness.sql`:
- Around line 15-20: The DELETE currently keeps the row with MAX(id) which can
remove the freshest metadata; instead, change the logic to retain the row per
(repo_owner, repo_name, pr_number) chosen by latest readiness_computed_at (use
NULLS LAST), then latest updated_at as a secondary key, and only use id as the
final tie-breaker — implement this by computing a ROW_NUMBER() over (PARTITION
BY repo_owner, repo_name, pr_number ORDER BY readiness_computed_at DESC NULLS
LAST, updated_at DESC NULLS LAST, id DESC) and delete rows where row_number > 1
so the freshest row (by readiness_computed_at/updated_at, id tie-breaker)
survives.
- Around line 15-23: The deduplication DELETE and unique index in
migrations/0004_add_canonical_url_and_composite_uniqueness.sql are
case-sensitive because repo_owner and repo_name lack COLLATE NOCASE, so
mixed-case variants slip through; fix by either updating the schema to enforce
case-insensitive identity (add COLLATE NOCASE to repo_owner and repo_name
columns and recreate the GROUP BY and idx_prs_identity_unique to use NOCASE) or
normalize values to lowercase before persisting (in parse_pr_url() and/or
upsert_pr() ensure repo_owner and repo_name are normalized to lowercase), then
re-run the dedupe SQL using LOWER(...) or COLLATE NOCASE to consolidate existing
duplicates.
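
The two migration comments above can be combined into one hedged sketch, run here against an in-memory SQLite database. The table layout and sample rows are invented for the example; the column names and the index name `idx_prs_identity_unique` come from the comments. `NULLS LAST` is left out because SQLite already sorts NULLs last under `DESC`, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE prs (
        id INTEGER PRIMARY KEY,
        repo_owner TEXT NOT NULL,
        repo_name TEXT NOT NULL,
        pr_number INTEGER NOT NULL,
        readiness_computed_at TEXT,
        updated_at TEXT
    );
    -- Row 2 is a newer id but carries staler metadata than row 1.
    INSERT INTO prs VALUES
        (1, 'OWASP-BLT', 'BLT-Leaf', 262, '2026-03-24T10:00:00Z', '2026-03-24T10:00:00Z'),
        (2, 'owasp-blt', 'blt-leaf', 262, NULL, '2026-03-20T09:00:00Z'),
        (3, 'OWASP-BLT', 'BLT-Leaf', 279, NULL, NULL);
    """
)

conn.executescript(
    """
    -- Keep the freshest row per case-insensitive identity; id is the last tie-breaker.
    DELETE FROM prs WHERE id IN (
        SELECT id FROM (
            SELECT id, ROW_NUMBER() OVER (
                PARTITION BY LOWER(repo_owner), LOWER(repo_name), pr_number
                ORDER BY readiness_computed_at DESC, updated_at DESC, id DESC
            ) AS rn
            FROM prs
        ) WHERE rn > 1
    );
    CREATE UNIQUE INDEX idx_prs_identity_unique
        ON prs (repo_owner COLLATE NOCASE, repo_name COLLATE NOCASE, pr_number);
    """
)

remaining = [row[0] for row in conn.execute("SELECT id FROM prs ORDER BY id")]
```

A `MAX(id)` strategy would have kept row 2 here and discarded the row with a computed readiness timestamp, which is the data loss the first comment warns about.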

In `@public/index.html`:
- Around line 1274-1279: The repo-path branch currently special-cases the second
path segment and misclassifies GitHub pages; update the validation to check only
the first segment against RESERVED_OWNER_NAMES (the RESERVED_OWNER_NAMES Set)
and treat any non-empty second segment as a valid repo name, i.e. accept
two-segment paths when the first segment is not in RESERVED_OWNER_NAMES rather
than blacklisting specific second-segment values—modify the repo-path branch
logic to use RESERVED_OWNER_NAMES.has(firstSegment) and ensure secondSegment is
present/non-empty.
- Line 571: Make the `#url-error` element a proper live region and associate it
with the input so screen-readers announce validation failures: give `#url-error`
an aria-live (e.g., "polite") and role (e.g., "status"), ensure setUrlError()
removes the "hidden" class and sets
`#prUrlInput.setAttribute`('aria-invalid','true') and
`#prUrlInput.setAttribute`('aria-describedby','url-error') when showing an error,
and when clearing the error hide `#url-error` and reset `#prUrlInput` by removing
aria-describedby and setting aria-invalid to "false" (apply the same changes to
the other error block at lines ~1281-1290).
- Around line 3298-3300: The code canonicalizes the input into canonicalInputUrl
but later comparisons still use the raw query-string URL, causing legacy ?pr=
links to miss matches; update the duplicate-check and highlighted-row compare
logic to use parsedUrl.canonicalUrl (canonicalInputUrl) instead of the raw
query-string URL variable so tracked rows are compared against the canonical
form (apply this change to the compare/duplicate-check locations referenced
around the existing comparison blocks at ~3353-3354 and ~3807-3815); ensure any
assignment, lookup, or equality check that currently uses the raw input URL now
uses canonicalInputUrl.
- Around line 1295-1297: The validation message for empty input is misleading —
update the return reason when value is falsy in the validation block that
currently returns { valid: false, reason: 'Please enter a PR URL' } so it
reflects accepted formats (PR, repository, or organization URL); change the
reason to a more generic prompt like "Please enter a PR, repository, or
organization URL" in that same validation branch where value is checked.

In `@src/utils.py`:
- Around line 63-65: The inline guard in parse_repo_url currently uses a
one-line if with a return, causing Ruff E701; change the guard to a multi-line
if statement by splitting it into two lines (use "if not url:" on its own line
followed by an indented "return None") inside the parse_repo_url function so the
single-line multiple-statement form is removed.
- Around line 69-72: The extra special-case guard that checks repo.lower() ==
'pull' should be removed because _GITHUB_REPO_RE already excludes PR-style URLs
and this guard wrongly rejects real repos named "pull"; modify the function that
extracts owner and repo (the code using match.group(1) and match.group(2) and
referencing repo) to simply return the owner and repo without the if
repo.lower() == 'pull' early return, ensuring _GITHUB_REPO_RE remains the
validation point.

In `@test-data-display.js`:
- Around line 715-719: The current logic treats SKIP_RUNTIME_AUTH_TEST as a
pass; update the branch that checks String(error.message ||
'').includes('SKIP_RUNTIME_AUTH_TEST:') so it does NOT call testResult with a
true pass flag — instead report it as skipped (e.g., call testResult with a
distinct skipped status or false plus a skipped marker in the message) so the
runtime auth setup is not counted as passed; change the call to testResult in
that branch to use a skipped status/message (retain `Skipped: ${error.message}`)
and ensure the rest of the code that aggregates results recognizes the skipped
status.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: OWASP-BLT/coderabbit/.coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: e6a1a3a9-4007-4073-8cd6-b25470905f4e

📥 Commits

Reviewing files that changed from the base of the PR and between b23fc17 and f08457d.

📒 Files selected for processing (7)
  • migrations/0004_add_canonical_url_and_composite_uniqueness.sql
  • public/index.html
  • src/database.py
  • src/handlers.py
  • src/utils.py
  • test-data-display.js
  • test-url-validation.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • test-url-validation.js

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
src/handlers.py (1)

277-278: ⚠️ Potential issue | 🟠 Major

Use NOCASE for the identity lookups too.

The new identity key is case-insensitive, but these reads are still plain = comparisons. If a PR is stored once as OWASP-BLT/BLT-Leaf#262 and later comes in as owasp-blt/blt-leaf#262, the write path can still treat it as the same PR while these follow-up queries miss the tracked row, which breaks auto-analysis and webhook refresh handling.

🛠️ Apply the same predicate to every identity lookup
- 'SELECT * FROM prs WHERE repo_owner = ? AND repo_name = ? AND pr_number = ?'
+ 'SELECT * FROM prs WHERE repo_owner COLLATE NOCASE = ? AND repo_name COLLATE NOCASE = ? AND pr_number = ?'

Also applies to: 1122-1123, 1148-1149, 1307-1308

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/handlers.py` around lines 277 - 278, The SELECT identity lookups use
case-sensitive '=' comparisons; change the WHERE clauses to perform
case-insensitive matching on the text columns by adding COLLATE NOCASE (e.g.
update the query string 'SELECT * FROM prs WHERE repo_owner = ? AND repo_name =
? AND pr_number = ?' to use 'repo_owner COLLATE NOCASE = ? AND repo_name COLLATE
NOCASE = ? AND pr_number = ?'; pr_number is numeric and needs no collation) and
keep the same bind parameters (parsed['owner'], parsed['repo'],
parsed['pr_number']); apply the same change to the similar queries around lines
1122-1123, 1148-1149 and 1307-1308 so every identity lookup uses NOCASE.
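
The difference between the two predicates can be checked with a small, self-contained sketch. The three-column table is a simplified assumption; the query text follows the suggested diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the prs table.
conn.execute("CREATE TABLE prs (repo_owner TEXT, repo_name TEXT, pr_number INTEGER)")
conn.execute("INSERT INTO prs VALUES ('OWASP-BLT', 'BLT-Leaf', 262)")

CASE_SENSITIVE = (
    "SELECT * FROM prs WHERE repo_owner = ? AND repo_name = ? AND pr_number = ?"
)
CASE_INSENSITIVE = (
    "SELECT * FROM prs WHERE repo_owner COLLATE NOCASE = ? "
    "AND repo_name COLLATE NOCASE = ? AND pr_number = ?"
)

# The same PR arriving with different letter casing.
params = ("owasp-blt", "blt-leaf", 262)
missed = conn.execute(CASE_SENSITIVE, params).fetchone()
found = conn.execute(CASE_INSENSITIVE, params).fetchone()
```

Note that SQLite's built-in NOCASE collation only folds ASCII letters, which is sufficient for GitHub owner and repository names.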
test-data-display.js (1)

103-110: ⚠️ Potential issue | 🟠 Major

child.killed does not mean the child has exited.

In Node, child.killed flips to true once kill() successfully sends SIGTERM; it does not mean the process is gone. That makes the SIGKILL fallback unreachable on Unix, so a stuck wrangler can still leak and make later runs flaky.

🛠️ Minimal fix
   await sleep(800);
-  if (!child.killed) {
+  if (child.exitCode === null && child.signalCode === null) {
     try {
       child.kill('SIGKILL');
     } catch (_) {
       // Ignore cleanup failures while shutting down test runtime processes.
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test-data-display.js` around lines 103 - 110, The current logic assumes
child.killed means the process exited, but it only indicates a kill signal was
sent; instead, send a graceful signal (SIGTERM), then wait for the child to
actually exit by awaiting an event or checking exitCode within a short timeout
(e.g., listen for the child's 'exit' or 'close' event or poll child.exitCode for
up to ~800ms), and only if the process hasn't exited after that timeout send
SIGKILL (preserving the existing try/catch around kill calls). Refer to the
child variable and the current sleep(800)/kill('SIGKILL') logic and replace it
with: send SIGTERM, await exit with a Promise/timeout, then send SIGKILL as the
fallback.
🧹 Nitpick comments (1)
public/index.html (1)

1281-1300: Error state management looks good.

The function properly handles both displaying and clearing validation errors with appropriate ARIA attributes.

Minor nit: Line 1296 sets aria-invalid to "false" rather than removing the attribute. While valid, removing the attribute when the input is valid is slightly cleaner since aria-invalid="false" is the implicit default.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@public/index.html` around lines 1281 - 1300, In setUrlError, when clearing
the error (message falsy) remove the aria-invalid attribute from the input
instead of setting it to "false": keep the existing behavior of setting
aria-invalid="true" and aria-describedby when message is truthy, but in the else
branch call input.removeAttribute('aria-invalid') (and continue removing
aria-describedby) so the attribute is absent when the input is valid; reference
function name setUrlError and elements with ids 'url-error' and 'prUrlInput'.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@public/index.html`:
- Around line 3319-3329: The first validation uses an unreachable condition
because effectiveAddAll is computed as addAll || parsedUrl.type !== 'pr';
replace that logic by removing effectiveAddAll and directly checking the
intended cases: if (!addAll && parsedUrl.type !== 'pr') { setUrlError('Use a
pull request URL when bulk import is not enabled'); input.focus(); return; } and
keep the existing addAll && parsedUrl.type === 'pr' branch as-is; update
references to effectiveAddAll (if any) and ensure you use addAll and
parsedUrl.type in the conditions and call setUrlError and input.focus in both
branches.

In `@src/utils.py`:
- Around line 18-25: The repo vs org URL regexes are overlapping so
parse_repo_url() accepts "/orgs/<owner>" as a repo; update the patterns used by
_GITHUB_REPO_RE and _GITHUB_ORG_RE (and the other duplicated regex blocks around
the other ranges) so that repo regex does not match paths that begin with
"orgs/" and org regex accepts both bare owner URLs and "/orgs/<owner>" forms;
then ensure parse_repo_url() and parse_org_url() use the corrected
_GITHUB_REPO_RE and _GITHUB_ORG_RE (and adjust handle_add_pr() routing if
needed) so organization bulk URLs are routed to parse_org_url() rather than to
the repo path.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: OWASP-BLT/coderabbit/.coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: d9e1cc32-5468-4232-aaeb-107299418ddc

📥 Commits

Reviewing files that changed from the base of the PR and between 5fba838 and 24413dd.

📒 Files selected for processing (8)
  • migrations/0004_add_canonical_url_and_composite_uniqueness.sql
  • package.json
  • public/index.html
  • src/database.py
  • src/handlers.py
  • src/utils.py
  • test-data-display.js
  • test-url-validation.js

@coderabbitai

coderabbitai bot commented Mar 25, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check current PR review status, CI checks, and unresolved comments
gh pr view 262 --repo OWASP-BLT/BLT-Leaf --json reviews,statusCheckRollup,comments \
  | jq '{
    latestReviews: [.reviews[] | {author: .author.login, state: .state, submittedAt: .submittedAt}] | sort_by(.submittedAt) | reverse | .[0:5],
    checks: [.statusCheckRollup[] | {name: .name, status: .status, conclusion: .conclusion}]
  }'

# Count unresolved review threads
gh api repos/OWASP-BLT/BLT-Leaf/pulls/262/comments \
  --jq '[.[] | select(.position != null or .original_position != null)] | length'

# Get unresolved inline comments summary
gh api repos/OWASP-BLT/BLT-Leaf/pulls/262/comments \
  --jq '[.[] | {id: .id, user: .user.login, body: .body[0:120], path: .path, line: .line}]'

Length of output: 5243


@S3DFX-CYBER: Not yet mergeable. Here's the current status:

CI checks: ✅ All passing.

Review status: Changes Requested (most recent review: 2026-03-25T17:14:31Z). There are now 18 unresolved inline comments; the re-review after the author's "all fixed" claim surfaced additional issues. Summary:

| Severity | Count | Representative Issues |
|---|---|---|
| 🔴 Critical | 1 | test-url-validation.js Line 30: test sandbox correctness issue |
| 🟠 Major | ~11 | Inline error stays hidden (display:none hardcode); valid bulk-import inputs rejected; validation placed outside addPr(); canonical_url migration safety; wrong duplicate-survival strategy (MAX(id)); GitHub Pages URL misclassification; missing ARIA attributes; org bulk URL not supported in backend parser; test-data-display.js Line 142 issue |
| 🟡 Minor | ~5 | Empty-input copy mismatch; legacy ?pr= canonicalization gap; Ruff E701 on src/utils.py Line 65; dead code in public/index.html Line 3329; skipped test counted as passed |

The author should work through all open inline comments on the Files changed tab before requesting another review.

@ojaswa072
Contributor Author

@coderabbitai What are the blocking issues in the PR?

@owasp-blt

owasp-blt bot commented Apr 1, 2026

🎉 @ojaswa072 You referred a new contributor! Total referrals: 1. Current rank: #3.

Monthly Referral Leaderboard (your neighborhood)

| Rank | Contributor | Referrals |
|---|---|---|
| 1 | @DonnieBLT | 1 |
| 2 | @Jayant2908 | 1 |
| 3 | @ojaswa072 ← you | 1 |
| 4 | @swaparup36 | 1 |

@coderabbitai

coderabbitai bot commented Apr 1, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get current review state and latest review
gh pr view 262 --repo OWASP-BLT/BLT-Leaf \
  --json reviews,statusCheckRollup \
  | jq '{
      latestReviews: [.reviews[] | {author: .author.login, state: .state, submittedAt: .submittedAt}] | sort_by(.submittedAt) | reverse | .[0:5],
      checks: [.statusCheckRollup[] | {name: .name, status: .status, conclusion: .conclusion}]
    }'

# Get all open inline review comments (full body, file, line)
gh api repos/OWASP-BLT/BLT-Leaf/pulls/262/comments \
  --jq '[.[] | {id: .id, user: .user.login, path: .path, line: .line, original_line: .original_line, body: .body}]'

Length of output: 55951


@ojaswa072 — Good news: the latest review I submitted is ✅ APPROVED (2026-03-25T17:20:32Z) and all CI checks are passing. However, there are still 4 unresolved inline comments left open. Here's a precise breakdown:


✅ Already Resolved (14/18 comments addressed)

Great work — the vast majority of issues have been fixed across commits b23fc17, f08457d, 9ec3961, and 24413dd.


🔓 Still Open (4 comments)

🟠 Major — public/index.html Line 686

Validation placed outside addPr() — keyboard submission bypasses it
A separate click listener was added, but addPr() is also called on Enter (Line 3397). The early return in the click handler doesn't stop the keyboard path, so pressing Enter still submits invalid URLs without validation.

Fix: Move the parseGitHubTrackingUrl validation guard inside addPr() and remove the standalone click listener.


🟠 Major — src/handlers.py Line 278

canonical_url upsert can create duplicate rows
upsert_pr is writing parsed['canonical_url'] but the conflict key is still pr_url. A PR already stored with a trailing slash or different casing won't conflict, resulting in duplicate rows and missed lookups.

Fix: Change the upsert conflict target to the composite key (repo_owner, repo_name, pr_number) and ensure all lookups use the same composite key instead of pr_url string equality.
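
A minimal sketch of what a composite-key upsert could look like, assuming a simplified `prs` table with a unique constraint on `(repo_owner, repo_name, pr_number)` and case-insensitive owner/repo columns. The schema and the `UPSERT` statement here are illustrative assumptions, not the contents of `upsert_pr` or the project's migration; `ON CONFLICT ... DO UPDATE` needs SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: identity columns compare case-insensitively and form the unique key.
conn.execute(
    """
    CREATE TABLE prs (
        id INTEGER PRIMARY KEY,
        repo_owner TEXT NOT NULL COLLATE NOCASE,
        repo_name TEXT NOT NULL COLLATE NOCASE,
        pr_number INTEGER NOT NULL,
        pr_url TEXT NOT NULL,
        UNIQUE (repo_owner, repo_name, pr_number)
    )
    """
)

# Conflict target is the composite identity, not the pr_url string.
UPSERT = """
    INSERT INTO prs (repo_owner, repo_name, pr_number, pr_url)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (repo_owner, repo_name, pr_number)
    DO UPDATE SET pr_url = excluded.pr_url
"""

canonical = "https://github.com/OWASP-BLT/BLT-Leaf/pull/262"
# Same PR submitted twice with different owner/repo casing.
conn.execute(UPSERT, ("OWASP-BLT", "BLT-Leaf", 262, canonical))
conn.execute(UPSERT, ("owasp-blt", "blt-leaf", 262, canonical))

row_count = conn.execute("SELECT COUNT(*) FROM prs").fetchone()[0]
print(row_count)
```

With the conflict target on the composite key, the second submission updates the existing row instead of inserting a new one, so casing or trailing-slash differences in the URL string no longer matter for identity.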


🟠 Major — src/utils.py Line 25

/orgs/<owner> bulk URLs misrouted to parse_repo_url()
_GITHUB_REPO_RE matches two-segment paths, so https://github.com/orgs/MyOrg is parsed as owner=orgs, repo=MyOrg — routed to repo import instead of org scanning.

Fix: Add a guard in parse_repo_url() to return None when owner.lower() == 'orgs', and update parse_org_url() to also recognize the /orgs/<owner> URL form.
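
The guard described above might look roughly like the sketch below. The function and constant names come from this thread, but the regex patterns and return shapes are assumptions, since the real definitions in src/utils.py are not shown here.

```python
import re

# Hypothetical patterns; the actual regexes in src/utils.py may differ.
_GITHUB_REPO_RE = re.compile(r"^https://github\.com/([^/]+)/([^/]+)/?$", re.IGNORECASE)
_GITHUB_ORG_RE = re.compile(r"^https://github\.com/(?:orgs/)?([^/]+)/?$", re.IGNORECASE)


def parse_repo_url(url):
    """Return {'owner', 'repo'} for a repo URL, or None (including /orgs/<owner>)."""
    match = _GITHUB_REPO_RE.match(url.strip())
    if not match:
        return None
    owner, repo = match.group(1), match.group(2)
    if owner.lower() == "orgs":
        # /orgs/<owner> is an organization URL, not a repository.
        return None
    return {"owner": owner, "repo": repo}


def parse_org_url(url):
    """Return {'owner'} for bare-owner or /orgs/<owner> URLs, or None."""
    match = _GITHUB_ORG_RE.match(url.strip())
    if not match:
        return None
    owner = match.group(1)
    if owner.lower() == "orgs":
        # A bare https://github.com/orgs carries no owner name.
        return None
    return {"owner": owner}


print(parse_repo_url("https://github.com/orgs/MyOrg"))  # None
print(parse_org_url("https://github.com/orgs/MyOrg"))   # {'owner': 'MyOrg'}
```

Keeping the `orgs` check inside `parse_repo_url()` means callers that try the repo parser first still fall through to the org parser for `/orgs/<owner>` inputs.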


🟡 Minor — public/index.html Line 3329

Dead/unreachable code: first validation branch can never execute

const effectiveAddAll = addAll || parsedUrl.type !== 'pr';
if (!effectiveAddAll && parsedUrl.type !== 'pr') { // ← logically impossible

!effectiveAddAll implies parsedUrl.type === 'pr', so the && parsedUrl.type !== 'pr' condition is a contradiction. This is dead code that confuses maintainers.

Fix: Remove the first if block and replace with a direct check on addAll and parsedUrl.type.
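
The contradiction can be checked by brute force. The sketch below ports the two conditions to Python and enumerates every combination; the set of `parsedUrl.type` values (`pr`, `repo`, `org`) is an assumption based on the input kinds mentioned in this PR.

```python
from itertools import product

URL_TYPES = ("pr", "repo", "org")  # assumed possible values of parsedUrl.type

original_hits = []
direct_hits = []
for add_all, url_type in product((False, True), URL_TYPES):
    effective_add_all = add_all or url_type != "pr"
    # Original guard: requires effective_add_all to be False AND url_type != "pr".
    if not effective_add_all and url_type != "pr":
        original_hits.append((add_all, url_type))
    # Suggested guard: test add_all and url_type directly.
    if not add_all and url_type != "pr":
        direct_hits.append((add_all, url_type))

print(original_hits)  # []
print(direct_hits)    # [(False, 'repo'), (False, 'org')]
```

The original guard never fires for any input, while the direct check catches exactly the non-PR URLs submitted without bulk import enabled.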


Summary

| # | Severity | File | Issue |
|---|---|---|---|
| 1 | 🟠 Major | public/index.html L686 | Validation bypassed via keyboard (Enter key) |
| 2 | 🟠 Major | src/handlers.py L278 | Duplicate PR rows due to wrong upsert conflict key |
| 3 | 🟠 Major | src/utils.py L25 | /orgs/<owner> bulk URLs misrouted to repo parser |
| 4 | 🟡 Minor | public/index.html L3329 | Unreachable dead code in validation logic |

Address these 4 items and the PR should be ready to merge. The 3 Major ones are the true blockers.

@ojaswa072
Contributor Author

Resolved

@SatishKumar620

@coderabbitai Please assess the PR's completeness and merge readiness.

@coderabbitai

coderabbitai bot commented Apr 1, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check current review state and CI checks
gh pr view 262 --repo OWASP-BLT/BLT-Leaf \
  --json reviews,statusCheckRollup,mergeable,mergeStateStatus \
  | jq '{
      latestReviews: [.reviews[] | {author: .author.login, state: .state, submittedAt: .submittedAt}] | sort_by(.submittedAt) | reverse | .[0:5],
      checks: [.statusCheckRollup[] | {name: .name, status: .status, conclusion: .conclusion}],
      mergeable: .mergeable,
      mergeStateStatus: .mergeStateStatus
    }'

# Get all current inline review comments
gh api repos/OWASP-BLT/BLT-Leaf/pulls/262/comments \
  --jq '[.[] | {id: .id, user: .user.login, path: .path, line: .line, body: .body[0:200]}]'

Length of output: 6816


@SatishKumar620 — Here is the current merge-readiness assessment for PR #262:


🔍 PR Status Summary

| Signal | Status |
|---|---|
| GitHub mergeability | MERGEABLE / CLEAN |
| CI checks | ✅ All passing (validate, Setup Project and Test Data Display, add_files_changed_label) |
| Latest bot review | ✅ APPROVED (2026-03-25T17:20:32Z) |
| Inline comment threads | ⚠️ 18 open (not formally resolved via GitHub UI) |

⚠️ Caution: Unresolved Inline Threads

While GitHub marks the PR as clean and my latest review is an approval, 18 inline comment threads have not been formally closed (via GitHub's "Resolve conversation"). Of these, 8 still point at live code positions, meaning the underlying code may not have changed:

| Severity | File | Line | Issue |
|---|---|---|---|
| 🔴 Critical | test-url-validation.js | L30 | Test sandbox correctness: duplicate PR_PATH_RE definitions may cause the sandbox to test the wrong regex |
| 🟠 Major | public/index.html | L1279 | parseGitHubTrackingUrl misclassifies GitHub Pages/reserved paths (e.g. /settings/profile) as repo URLs |
| 🟠 Major | test-data-display.js | L163 | Windows wrapper spawn issue in process handling |
| 🟠 Major | src/utils.py | L25 | /orgs/<owner> bulk URLs still misrouted to parse_repo_url() as owner=orgs, repo=<owner> |
| 🟡 Minor | src/utils.py | L74 | Dead repo == "pull" special-case guard |
| 🟡 Minor | public/index.html | L1330 | Empty-input error copy still says "Please enter a PR URL" despite repo/org URLs now being valid |
| 🟡 Minor | public/index.html | L3334 | Legacy ?pr= flow doesn't canonicalize URL before comparing tracked rows |
| 🟡 Minor | test-data-display.js | L787 | Skipped runtime auth test still recorded as a passed test |

The 10 comments at line: null are on outdated diff positions, suggesting those areas of code were modified — those are likely addressed.


📋 Recommendation

The PR can technically be merged (GitHub shows CLEAN + APPROVED), but before doing so I'd recommend:

  1. Verify the Critical issue at test-url-validation.js L30 — a broken test sandbox means the URL validation tests may pass even if the logic is wrong.
  2. Verify the src/utils.py L25 org URL misrouting — this is a functional bug where bulk-import of org URLs would silently be treated as repo imports.
  3. Confirm public/index.html L1279 — reserved GitHub paths being treated as valid repo URLs could mislead users.

The author's last reply was a one-word "Resolved" with no accompanying commit reference. If the 4 issues from the earlier open-comments breakdown were genuinely fixed in prior commits, the maintainer can proceed to merge. If not, the author should address them first.


@SatishKumar620 SatishKumar620 left a comment


Resolve the issues flagged by CR.

@ojaswa072
Contributor Author

@SatishKumar620 Can you please state exactly what to resolve?

@owasp-blt

owasp-blt bot commented Apr 1, 2026

🎉 @ojaswa072 You referred a new contributor! Total referrals: 2. Current rank: #1.

Monthly Referral Leaderboard (your neighborhood)

| Rank | Contributor | Referrals |
|---|---|---|
| 1 | @ojaswa072 ← you | 2 |
| 2 | @DonnieBLT | 1 |
| 3 | @Jayant2908 | 1 |

@SatishKumar620

@ojaswa072 Test sandbox correctness — duplicate PR_PATH_RE definitions may cause the sandbox to test the wrong regex

@owasp-blt

owasp-blt bot commented Apr 1, 2026

🎉 @SatishKumar620 You referred a new contributor! Total referrals: 2. Current rank: #2.

Monthly Referral Leaderboard (your neighborhood)

| Rank | Contributor | Referrals |
|---|---|---|
| 1 | @ojaswa072 | 2 |
| 2 | @SatishKumar620 ← you | 2 |
| 3 | @DonnieBLT | 1 |
| 4 | @Jayant2908 | 1 |

@ojaswa072
Contributor Author

test-url-validation.js L30 Test sandbox correctness — duplicate PR_PATH_RE definitions may cause the sandbox to test the wrong regex

@SatishKumar620 Can you confirm that it really exists?

@SatishKumar620

@coderabbitai Please assess the PR's completeness and merge readiness.

@coderabbitai

coderabbitai bot commented Apr 1, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get current review state, CI checks, and mergeability
gh pr view 262 --repo OWASP-BLT/BLT-Leaf \
  --json reviews,statusCheckRollup,mergeable,mergeStateStatus \
  | jq '{
      latestReviews: [.reviews[] | {author: .author.login, state: .state, submittedAt: .submittedAt}] | sort_by(.submittedAt) | reverse | .[0:5],
      checks: [.statusCheckRollup[] | {name: .name, status: .status, conclusion: .conclusion}],
      mergeable: .mergeable,
      mergeStateStatus: .mergeStateStatus
    }'

# Get all current inline review comments (full body, file, line)
gh api repos/OWASP-BLT/BLT-Leaf/pulls/262/comments \
  --jq '[.[] | {id: .id, user: .user.login, path: .path, line: .line, original_line: .original_line, body: .body}]'

Length of output: 56018


@SatishKumar620 — Here is the current merge-readiness assessment for PR #262:


🔍 PR Status Summary

| Signal | Status |
|---|---|
| GitHub mergeability | MERGEABLE / CLEAN |
| CI checks | ✅ All passing (validate, Setup Project and Test Data Display, add_files_changed_label) |
| Latest human review | CHANGES_REQUESTED by @SatishKumar620 (2026-04-01T18:54:35Z, today) |
| Latest bot review | ✅ APPROVED (2026-03-25T17:20:32Z) |
| Open inline threads | ⚠️ 4 unresolved |

✅ Progress: 14 of 18 Comments Addressed

Good progress across commits b23fc17, f08457d, 9ec3961, and 79911b1–24413dd. The following have been confirmed fixed:

  • Inline error visibility (display:none hardcode)
  • Valid bulk-import inputs being rejected
  • Test sandbox PR_PATH_RE extraction (wrong occurrence)
  • Migration MAX(id) survival strategy → ROW_NUMBER() by readiness_computed_at
  • COLLATE NOCASE on repo_owner/repo_name
  • ARIA live-region for assistive tech
  • GitHub Pages misclassification in parseGitHubTrackingUrl
  • Empty-input copy mismatch
  • Legacy ?pr= canonicalization gap
  • Ruff E701 on src/utils.py
  • repo == "pull" dead guard
  • Skipped test counted as passed
  • Windows wrangler process tree teardown

🔓 Still Open (4 comments — PR is NOT ready to merge)

| # | Severity | File | Issue |
|---|---|---|---|
| 1 | 🟠 Major | public/index.html L686 | Validation in standalone click handler; keyboard (Enter) path in addPr() still bypasses it |
| 2 | 🟠 Major | src/handlers.py L278 | upsert_pr writes canonical_url but conflict key is still pr_url; mismatched-case or trailing-slash variants create duplicate rows |
| 3 | 🟠 Major | src/utils.py L25 | /orgs/<owner> bulk URLs matched by _GITHUB_REPO_RE as owner=orgs, repo=<owner>; silently routed to repo import instead of org scan |
| 4 | 🟡 Minor | public/index.html L3329 | Dead/unreachable if (!effectiveAddAll && parsedUrl.type !== 'pr') branch; logically impossible condition |

The 3 Major issues are the true blockers. @ojaswa072 should address all 4 before requesting another review.

@coderabbitai

coderabbitai bot commented Apr 1, 2026


🧩 Analysis chain

Let me inspect the current state of the file directly.


🏁 Script executed:

#!/bin/bash
# Check the current content of test-url-validation.js on the PR branch
gh api repos/OWASP-BLT/BLT-Leaf/contents/test-url-validation.js \
  --jq '.content' | base64 --decode

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Fetch test-url-validation.js from the PR branch
gh api "repos/OWASP-BLT/BLT-Leaf/contents/test-url-validation.js?ref=fix/pr-url-input-validation" \
  --jq '.content' | base64 --decode

Length of output: 92


🏁 Script executed:

#!/bin/bash
# Get exact branch info from PR
gh pr view 262 --repo OWASP-BLT/BLT-Leaf --json headRefName,headRefOid,headRepository \
  | jq '{headRefName, headRefOid, headRepository: .headRepository.name}'

Length of output: 142


🏁 Script executed:

#!/bin/bash
# Fetch file using HEAD commit SHA
gh api "repos/OWASP-BLT/BLT-Leaf/contents/test-url-validation.js?ref=0d3324daaa50a8210359f90f1571beaadb20b312" \
  --jq '.content' | base64 --decode

echo "---FILE END---"

Length of output: 3584


@SatishKumar620 — The duplicate PR_PATH_RE issue has been fixed in the latest commits. Here's what the current test-url-validation.js shows:

The extractBlock() helper uses the full, exact regex literal as its start token:

const parserBlock = extractBlock(
  html,
  'const PR_PATH_RE = /^\\/([^/]+)\\/([^/]+)\\/pull\\/(\\d+)\\/?$/i;',
  'function showSectionMessage('
);

This is precise enough to uniquely locate the parser block in index.html. The sandbox also includes an explicit guard:

if (typeof parseGitHubTrackingUrl !== 'function') {
  throw new Error('parseGitHubTrackingUrl function not found in page script');
}

So if the extraction were wrong and parseGitHubTrackingUrl were not captured, the test would fail loudly rather than silently pass. This was also confirmed resolved in my APPROVED review from 2026-03-25.

The 4 remaining blockers for this PR are separate issues — none of them are the PR_PATH_RE sandbox concern:

| # | Severity | File | Issue |
|---|---|---|---|
| 1 | 🟠 Major | public/index.html L686 | Keyboard Enter path in addPr() bypasses validation |
| 2 | 🟠 Major | src/handlers.py L278 | upsert_pr conflict key mismatch → duplicate rows |
| 3 | 🟠 Major | src/utils.py L25 | /orgs/<owner> URLs misrouted to repo parser |
| 4 | 🟡 Minor | public/index.html L3329 | Dead/unreachable if branch |

@SatishKumar620

@ojaswa072 It has been resolved and no longer exists.
