[NEW QUERY] Add pathlib.Path.resolve() and is_relative_to() as path injection sanitizers#21

Draft
Copilot wants to merge 3 commits into main from copilot/fix-codeql-sanitizer-handling

Copilot AI commented Mar 25, 2026

📝 Query Information

  • Language: Python
  • Query ID: python/detect-unsanitized-rglob-path-traversal
  • Category: Security
  • Severity: warning
  • CWE/CVE: CWE-22

🎯 Description

What This Query Detects

The standard py/path-injection query does not recognize pathlib.Path.resolve() as path normalization or pathlib.Path.is_relative_to() as a safe access check, even though the equivalent os.path.realpath() and str.startswith() calls are already modeled. This causes false positives when the idiomatic pathlib sanitization pattern is used.

This PR extends the standard library's two-state flow machine (NotNormalized → NormalizedUnchecked) via its OO extension points:

  • PathlibResolveCall extends Path::PathNormalization::Range (analogous to OsPathRealpathCall in Stdlib.qll:~1065)
  • IsRelativeToCall extends Path::SafeAccessCheck::Range (analogous to StartswithCall in Stdlib.qll:~5090)

Both sanitizers must be applied together (normalize then check) for the flow to be blocked — matching the standard library's intended design.
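
The ordering requirement can be pictured with a toy model in Python. This is only an illustrative sketch, not the CodeQL implementation: the `FlowState` enum, the `SAFE` pseudo-state, and the `step` and `reaches_sink_tainted` helpers are hypothetical names invented here, and the real analysis tracks flow states over a data-flow graph rather than over a list of operations.

```python
from enum import Enum, auto


class FlowState(Enum):
    NOT_NORMALIZED = auto()        # tainted path, no normalization seen yet
    NORMALIZED_UNCHECKED = auto()  # normalized, but no safe-access check yet
    SAFE = auto()                  # illustrative only: flow is blocked, no alert


def step(state: FlowState, operation: str) -> FlowState:
    """Advance the toy state machine by one operation on the tainted path."""
    if operation == "normalize" and state is FlowState.NOT_NORMALIZED:
        return FlowState.NORMALIZED_UNCHECKED
    if operation == "check" and state is FlowState.NORMALIZED_UNCHECKED:
        return FlowState.SAFE
    # A check before normalization (or any other operation) changes nothing.
    return state


def reaches_sink_tainted(operations: list[str]) -> bool:
    """Return True if the path is still considered tainted at the sink."""
    state = FlowState.NOT_NORMALIZED
    for operation in operations:
        state = step(state, operation)
    return state is not FlowState.SAFE
```

In this model only `["normalize", "check"]` blocks the flow; `["normalize"]`, `["check"]`, and `["check", "normalize"]` all still reach the sink as tainted, which mirrors the resolve-only and check-only scenarios in the test list.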

Example Vulnerable Code

# Detected: no sanitization
def get_summary(job_id):
    job_dir = JOBS_ROOT / job_id
    matches = list(job_dir.rglob("summary.csv"))

Example Safe Code

# Not detected: resolve() + is_relative_to() breaks the taint flow
def get_summary(job_id):
    job_dir = JOBS_ROOT / job_id
    resolved = job_dir.resolve()
    if not resolved.is_relative_to(JOBS_ROOT):
        raise ValueError("path escapes root")
    matches = list(resolved.rglob("summary.csv"))
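
For reviewers who want to see the safe pattern behave at runtime, here is a minimal self-contained sketch. The helper name `resolve_job_dir`, the temporary directory layout, and the choice to resolve the root as well are assumptions made for this illustration; they are not part of the query or its test fixtures.

```python
import tempfile
from pathlib import Path


def resolve_job_dir(jobs_root: Path, job_id: str) -> Path:
    """Resolve job_id under jobs_root, rejecting paths that escape the root."""
    # Assumption: the root is resolved too, so both sides of the comparison
    # are absolute and symlink-free.
    root = jobs_root.resolve()
    resolved = (root / job_id).resolve()
    if not resolved.is_relative_to(root):  # requires Python 3.9+
        raise ValueError("path escapes root")
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    jobs_root = Path(tmp) / "jobs"
    (jobs_root / "job-1").mkdir(parents=True)

    # A well-formed job id stays inside the root.
    inside = resolve_job_dir(jobs_root, "job-1")

    # A traversal attempt is collapsed by resolve() and then rejected.
    try:
        resolve_job_dir(jobs_root, "../outside")
        traversal_rejected = False
    except ValueError:
        traversal_rejected = True
```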

Upstream re-bundling note

CodeQL is extensible via OO here — no custom query would be needed if these two classes were added directly to semmle/python/frameworks/Stdlib.qll in the standard library. The custom query exists only because the standard py/path-injection doesn't import our extensions.

🧪 Testing

  • Positive test cases included
  • Negative test cases included
  • Edge cases covered
  • All tests pass

Seven scenarios tested: unsanitized rglob, sanitized rglob (resolve+is_relative_to), resolve-only, check-only, sanitized open, unsanitized open, and existing realpath+startswith regression.
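
The check-only scenario is worth a concrete illustration, because `is_relative_to()` compares path components lexically and does not collapse `..` segments. The sketch below uses `PurePosixPath` and `posixpath.normpath` so that it behaves the same on every platform; `normpath` is only a lexical stand-in for `resolve()`, which additionally follows symlinks. The `/srv/jobs` root is a made-up example path.

```python
import posixpath
from pathlib import PurePosixPath

root = PurePosixPath("/srv/jobs")
candidate = root / "../../etc"  # attacker-controlled suffix

# Check-only: the ".." segments are still present, so the candidate
# lexically starts with the root and the check passes.
check_only_passes = candidate.is_relative_to(root)

# Normalize first (lexical stand-in for resolve()), then check:
# the path collapses to /etc and the check correctly fails.
normalized = PurePosixPath(posixpath.normpath(str(candidate)))
normalized_then_checked = normalized.is_relative_to(root)
```

Here `check_only_passes` is `True` while `normalized_then_checked` is `False`, which is why the check is only treated as a barrier once the path has been normalized.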

📋 Checklist

  • Query compiles without errors
  • Documentation complete (.md and .qhelp)
  • Metadata properly set (@name, @id, @kind, etc.)
  • Tests validate query behavior
  • No false positives in test cases


Note: This query was developed using Test-Driven Development methodology.



Copilot AI and others added 2 commits March 25, 2026 13:51
…itizers

Extend the standard py/path-injection query to recognize:
- pathlib.Path.resolve() as Path::PathNormalization::Range
- pathlib.Path.is_relative_to() as Path::SafeAccessCheck::Range

This enables the two-state flow analysis (NotNormalized → NormalizedUnchecked)
to correctly handle the resolve() + is_relative_to() sanitization pattern,
preventing false positives when both are used together.

Co-authored-by: felickz <1760475+felickz@users.noreply.github.com>
Agent-Logs-Url: https://github.com/testing-felickz/codeql-development-template/sessions/933a78be-b382-4058-bda3-08e9eac28cb0
- Rename resolveAttr to resolveMethodAccess for clarity
- Add documentation comment on the checks predicate explaining branch=true logic

Co-authored-by: felickz <1760475+felickz@users.noreply.github.com>
Agent-Logs-Url: https://github.com/testing-felickz/codeql-development-template/sessions/933a78be-b382-4058-bda3-08e9eac28cb0
Copilot AI changed the title from "[WIP] Fix CodeQL misunderstanding of sanitized dataflow in py-path-injection" to "[NEW QUERY] Add pathlib.Path.resolve() and is_relative_to() as path injection sanitizers" Mar 25, 2026
Copilot AI requested a review from felickz March 25, 2026 13:56

Development

Successfully merging this pull request may close these issues.

[Query Create]: CodeQL does not understand dataflow for sanitized rglob use in py-path-injection

2 participants