[Query Create]: CodeQL does not understand dataflow for sanitized rglob use in py-path-injection #20

@felickz

Description

CWE/CVE Reference (Optional)

"CWE-22"

Query Description

CodeQL currently reports a path-injection alert in py-path-injection even though the tainted value passes through two sanitization layers before it reaches the sink. The query should model the following sanitizer scenario more accurately:

  • A character blocklist is applied to the HTTP parameter job_id, rejecting any value that contains /, \, os.sep, os.altsep, or ..
  • The path is canonicalised with resolve() and its confinement is verified with is_relative_to(JOBS_ROOT)

Despite these checks, CodeQL flags a taint flow from job_id through JOBS_ROOT / job_id to rglob(). The query should treat the blocklist and the canonicalisation-plus-confinement check as sanitizers so that this pattern is not reported as a false positive.

Code Examples

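# Should NOT be detected (correctly sanitized):
import os  # JOBS_ROOT is assumed to be an already-resolved pathlib.Path

def get_summary(job_id):
    # Character blocklist; os.altsep is None on POSIX, so falsy entries are skipped
    blocked = [s for s in ("/", "\\", os.sep, os.altsep, "..") if s]
    if any(s in job_id for s in blocked):
        raise ValueError("bad job_id")
    job_dir = JOBS_ROOT / job_id
    # Path confinement check (is_relative_to requires Python 3.9+)
    resolved = job_dir.resolve()
    if not resolved.is_relative_to(JOBS_ROOT):
        raise ValueError("job_id escapes JOBS_ROOT")
    matches = list(job_dir.rglob("summary.csv"))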

# Should be detected (unsanitized):
def get_summary(job_id):
    job_dir = JOBS_ROOT / job_id
    matches = list(job_dir.rglob("summary.csv"))
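To make the expected behaviour concrete, here is a minimal, self-contained sketch of the two sanitization layers described above, exercised against a few inputs. It is an illustration only, not the code from the referenced repository: the helpers safe_job_dir and is_rejected are hypothetical names, and JOBS_ROOT points at a temporary directory created just for the demo.

```python
import os
import tempfile
from pathlib import Path

# Assumption: a throwaway root for this sketch; the real JOBS_ROOT lives in the app.
JOBS_ROOT = Path(tempfile.mkdtemp()).resolve()

# os.altsep is None on POSIX, so falsy entries are dropped before the substring test.
BLOCKED = [s for s in ("/", "\\", os.sep, os.altsep, "..") if s]


def safe_job_dir(job_id: str) -> Path:
    """Hypothetical helper: return the confined job directory or raise ValueError."""
    # Layer 1: character blocklist on the raw parameter.
    if any(s in job_id for s in BLOCKED):
        raise ValueError("bad job_id")
    # Layer 2: canonicalise, then verify the result is still under JOBS_ROOT.
    resolved = (JOBS_ROOT / job_id).resolve()
    if not resolved.is_relative_to(JOBS_ROOT):  # Python 3.9+
        raise ValueError("job_id escapes JOBS_ROOT")
    return resolved


def is_rejected(job_id: str) -> bool:
    """Hypothetical helper: True if safe_job_dir refuses the given job_id."""
    try:
        safe_job_dir(job_id)
    except ValueError:
        return True
    return False


# Create one legitimate job so that rglob has something to find.
(JOBS_ROOT / "batch_001").mkdir()
(JOBS_ROOT / "batch_001" / "summary.csv").write_text("ok\n")

matches = list(safe_job_dir("batch_001").rglob("summary.csv"))
```

With this setup, inputs such as "../etc", "a/b", or "a\\b" are refused by the blocklist before any path is built, while "batch_001" passes both layers and rglob() only searches beneath JOBS_ROOT. This is the flow the query would ideally recognise as sanitized.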

Query Name (Optional)

"DetectUnsanitizedRglobPathTraversal"

Query Type

"Security"

References (Optional)

"https://github.com/dsp-testing/py-path-injection/pull/2#issuecomment-4120253116, https://github.com/dsp-testing/py-path-injection/blob/main/app/sample_data/batch_001/summary.csv"

Expected Severity

"Medium"

Target Language

"python"

Code of Conduct

false

Labels

enhancement (New feature or request)
