chore: reduce FPs in whitespace PR by considering ; statement #1186

art1f1c3R · 2025-09-26T00:13:46Z

Summary

The excessive whitespace obfuscation rule introduced in #1086 was triggering many false positives because it lacked specificity. This PR fixes that by requiring the ; statement separator, which is the primary malicious indicator for this method of obfuscation.

Description of changes

The rule originally matched any run of excessive spacing (50 or more whitespace characters) before encountering code. Whitespace here includes newline characters, and because Semgrep runs regex pattern matching in multiline mode, the rule would trigger on code where a long statement was broken across multiple lines, like:

<indentation>                                             foo(arg1, another_foo(arg_2), 
                                                                                   arg3_on_other_line)

Due to differences in formatting in code files, it would sometimes also simply detect vertical spacing between lines.
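The false positive mechanism can be reproduced with a small Python sketch. The pattern below is a hypothetical approximation of an over-broad check (50 or more whitespace characters of any kind before code), not the regex that was in the rule file; `broad_rule_matches` and the sample strings are illustrative names only.

```python
import re

# Hypothetical approximation of the over-broad check: \s also matches
# newlines, so line breaks and indentation are counted together.
BROAD = re.compile(r"\s{50,}\S")


def broad_rule_matches(source: str) -> bool:
    """Return True if the over-broad pattern fires anywhere in source."""
    return BROAD.search(source) is not None


# A benign call wrapped across two lines with deep continuation indentation.
wrapped_call = (
    "result = foo(arg1, another_foo(arg_2),\n"
    + " " * 60
    + "arg3_on_other_line)\n"
)

# Benign code separated by a large vertical gap of empty lines.
vertical_gap = "x = 1\n" + "\n" * 55 + "y = 2\n"

for name, sample in [("wrapped_call", wrapped_call), ("vertical_gap", vertical_gap)]:
    print(name, broad_rule_matches(sample))
```

Both samples are benign, yet both match, which is the behaviour this PR sets out to remove.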

The key malicious indicator this rule aims to detect is a benign (syntactically valid) code statement at the start of a line, followed by the ; character to finish that statement and start a new, malicious one outside the visible area of a typical IDE (unless text wrapping is turned on, which is often off by default). This is syntactically valid when there is excessive spacing:

After the benign code statement, before inserting a ; and then writing a malicious statement
After the benign code statement, before inserting a ;, then further excessive spacing and a malicious statement
After a ; inserted immediately after the benign code statement, before writing a malicious statement
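The three placements above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `SEMICOLON_GAP` and `is_suspicious` are hypothetical names, and the pattern is not the regex shipped in obfuscation.yaml. It only shows the idea of requiring a ; next to 50 or more horizontal whitespace characters.

```python
import re

# Hypothetical pattern, not the one in obfuscation.yaml.
# [ \t] matches horizontal whitespace only, so newlines never count.
SEMICOLON_GAP = re.compile(
    r"\S[ \t]{50,};[ \t]*\S"   # long gap before the ; (first two cases)
    r"|"
    r"\S[ \t]*;[ \t]{50,}\S"   # long gap after the ; (third case)
)


def is_suspicious(source: str) -> bool:
    """Return True if a ; sits next to an excessive horizontal gap."""
    return SEMICOLON_GAP.search(source) is not None


gap = " " * 60

malicious_samples = [
    "x = 1" + gap + ";exec(payload)",              # gap, then ;
    "x = 1" + gap + ";" + gap + "exec(payload)",   # gap, ;, gap
    "x = 1;" + gap + "exec(payload)",              # ;, then gap
]

benign_samples = [
    "x = 1; y = 2",                                               # ordinary ;
    "foo(arg1, another_foo(arg_2),\n" + gap + "arg3_on_other_line)",  # wrapped call
    "x = 1\n" + "\n" * 60 + "y = 2",                              # vertical gap
]

for sample in malicious_samples + benign_samples:
    print(is_suspicious(sample))
```

A plain regex like this cannot tell whether the ; sits inside a string literal or comment, so it is only an approximation of the behaviour described here.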

The updated excessive_spacing.py showcases each of these cases. The obfuscation.yaml file has been rewritten with regex that reflects this. The rule was tested against the recently detected false positives logit-graph-0.1.0, kryon-ai-1.2.0, and cispark-0.1.0, as well as the integration test case django-5.0.6. It was also run against the Backstabbers dataset, which confirmed that it detects this malicious behaviour, and against popular and trusted packages from the ICSE25-AE-Evaluation dataset, for which it did not trigger.

Checklist

I have reviewed the contribution guide.
My PR title and commits follow the Conventional Commits convention.
My commits include the "Signed-off-by" line.
I have signed my commits following the instructions provided by GitHub. Note that we run GitHub's commit verification tool to check the commit signatures. A green verified label should appear next to all of your commits on GitHub.
I have updated the relevant documentation, if applicable.
I have tested my changes and verified they work as expected.

Signed-off-by: Carl Flottmann <[email protected]>

…#1186) Reduce false positives in the whitespace semgrep rule by considering the ; statement. Signed-off-by: Carl Flottmann <[email protected]>

art1f1c3R requested review from behnazh-w and tromai as code owners September 26, 2025 00:13

oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Sep 26, 2025

behnazh-w approved these changes Sep 26, 2025

art1f1c3R added 2 commits September 26, 2025 14:21

chore: reduce FPs in whitespace PR by considering ; statement

dcf43e4

Signed-off-by: Carl Flottmann <[email protected]>

docs: updated defaults and readme to clarify how to disable rules

33711dd

Signed-off-by: Carl Flottmann <[email protected]>

art1f1c3R force-pushed the art1f1c3R/whitespaces-fp-reduction branch from 1d3f4bc to 33711dd Compare September 26, 2025 04:23

behnazh-w approved these changes Sep 26, 2025

art1f1c3R merged commit 27f3cdd into main Sep 26, 2025
9 checks passed
