Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Several Bugs in the fuzz_submodule Causing a lot of False Alarms in the OSS-Fuzz Bug Tracker #1950

Conversation

DaveLak
Copy link
Contributor

@DaveLak DaveLak commented Aug 9, 2024

Fixes the buggy fuzz_submodule harness which is the root cause of all recent OSS-Fuzz/Monorail issues opened.

There are several distinct changes introduced here, but they are all addressing the same related exception handling weaknesses in the fuzz harness code so I think they make sense in a single PR.

Commit messages should provide relevant context, however I want to explicitly mention one change that is particularly noteworthy: the introduction of a mechanism to filter shallow errors using an explicit exceptions list.

This new pattern involves generating an 'explicit-exceptions-list.txt' by scanning for 'raise' and 'assert' statements via git grep during the container build step. The list helps the fuzz harness to distinguish between expected and unexpected exceptions, significantly reducing false positives.

The changes I propose here are intentionally limited in scope for now to get feedback/test in prod (lol) before adopting this pattern wholesale. If successful, which I believe it will be, it should make more developing more interesting tests faster to do.


P.S. sorry for the delay on this!!!

DaveLak and others added 6 commits August 7, 2024 22:07
Fixes a bug in the `fuzz_submodule` harness where the fuzzed data can
produce file names that exceed the maximum size allowed byt the OS. This
issue came up previously and was fixed in gitpython-developers#1922, but the submodule file
name fixed here was missed in that PR.

Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=69456
This reduces false positive test failures by identifying and
gracefully handling exceptions that are explicitly raised by GitPython,
thus reducing the false-positive fuzzing test failure rate.
Changes:
   - `match_exception_with_traceback` uses regular expressions for more
     flexible matching of file paths and line numbers. This allows for
     partial matches and more complex patterns.

   - Improve `check_exception_against_list` by delegating to
     `match_exception_with_traceback` for checking tracebacks against
     exception list entries.

   - `load_exception_list`: Remains largely unchanged, as it correctly
     parses the file and line number from each exception entry. However,
     we ensure the set consists of regex patterns to match against
     tracebacks.
Changes:

   - Simplify exception handling in test harnesses via `handle_exception(e)`
     in the `except Exception as e:` block.

   - `setup_git_environment` is a step towards centralizing environment
     variable and logging configuration set up consistently across
     different fuzzing scripts. **Only applying it to a single test for
     now is an intentional choice in case it fails to work in the
     ClusterFuzz environment!** If it proves successful, a follow-up
     change set will be welcome.
To ensure that all necessary files are included in the
explicit-exceptions-list.txt file and unwanted files and directories are
not.
The environment setup must happen before the `git` module is imported,
otherwise GitPython won't be able to find the Git executable and raise
an exception that causes the ClusterFuzz fuzzer runs to fail.
@DaveLak DaveLak force-pushed the fix-fuzz-submodules-filename-exception branch from bf2a112 to 2ed3334 Compare August 9, 2024 05:00
Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks so much!

@Byron Byron merged commit a621ff0 into gitpython-developers:main Aug 9, 2024
22 checks passed


@atheris.instrument_func
def load_exception_list(file_path: str = "explicit-exceptions-list.txt") -> Set[Tuple[str, str]]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this file isn't my most beautiful work (e.g., I'm not proud of hardcoding "explicit-exceptions-list.txt" in two places) but it worked quite well in local testing so I figure getting deployed to CLusterFuzz and validating it there was worth it for now 🤷‍♀️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly, I just checked that all changes are within the fuzzing code, but didn't review the changes themselves as these are entirely owned by you, a trusted contributor. Thus you are unlikely to hear complaints from me 😅.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly just left that comment in hopes that lightly shaming myself in public will push me to clean it up sooner than not acknowledging it at all 😅

@DaveLak DaveLak deleted the fix-fuzz-submodules-filename-exception branch August 9, 2024 07:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants