Commit
Replace the suboptimal fuzz_tree harness with a better alternative
As discussed in the initial fuzzing integration PR[^1], `fuzz_tree.py`'s implementation was not ideal in terms of coverage, and its reading from and writing to hard-coded paths inside `/tmp` was problematic because (among other concerns) it causes intermittent crashes on ClusterFuzz[^2] when multiple workers execute the test at the same time on the same machine.

The changes here replace `fuzz_tree.py` entirely with a new `fuzz_repo.py` fuzz target which:

- Uses `tempfile.TemporaryDirectory()` to safely manage tmpdir creation and teardown, including during multi-worker execution runs.
- Retains the same feature coverage as `fuzz_tree.py`, but also adds considerably more from much smaller data inputs and with less memory consumed (and it doesn't even have a seed corpus or target-specific dictionary yet).
- Can likely be improved further in the future by adding more features of `Repo` to the harness.

Because `fuzz_tree.py` was removed and `fuzz_repo.py` was not derived from it, the Apache License callouts in the docs were also updated, as they now apply only to the single `fuzz_config.py` file.

[^1]: #1901 (comment)
[^2]: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=68355
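To illustrate why per-invocation temporary directories avoid the multi-worker collisions described in the commit message, here is a minimal standard-library sketch. It is not part of the commit: the function name `run_in_private_dir` and the use of a thread pool to stand in for concurrent fuzz workers are assumptions made for illustration only.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def run_in_private_dir(worker_id):
    """Simulate one fuzz iteration that needs scratch space on disk.

    Hypothetical helper: each call gets its own uniquely named directory,
    so concurrent workers never read or write each other's files.
    """
    with tempfile.TemporaryDirectory() as temp_dir:
        scratch = os.path.join(temp_dir, "File0")
        with open(scratch, "wb") as f:
            f.write(f"worker {worker_id}".encode())
        existed_during_run = os.path.exists(scratch)
    # Leaving the with block removes the directory and everything in it.
    removed_afterwards = not os.path.exists(temp_dir)
    return temp_dir, existed_during_run and removed_afterwards


# Threads stand in for the multiple workers mentioned in the commit message.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_in_private_dir, range(4)))

paths = [path for path, _ in results]
```

With a single hard-coded path under `/tmp`, all four workers would have shared one location; here each worker receives a distinct path and cleans up after itself.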
Showing 5 changed files with 57 additions and 90 deletions.
This file was deleted.
@@ -0,0 +1,47 @@

    import atheris
    import io
    import sys
    import os
    import tempfile

    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        path_to_bundled_git_binary = os.path.abspath(os.path.join(os.path.dirname(__file__), "git"))
        os.environ["GIT_PYTHON_GIT_EXECUTABLE"] = path_to_bundled_git_binary

    with atheris.instrument_imports():
        import git


    def TestOneInput(data):
        fdp = atheris.FuzzedDataProvider(data)

        with tempfile.TemporaryDirectory() as temp_dir:
            repo = git.Repo.init(path=temp_dir)

            # Generate a minimal set of files based on fuzz data to minimize I/O operations.
            file_paths = [os.path.join(temp_dir, f"File{i}") for i in range(min(3, fdp.ConsumeIntInRange(1, 3)))]
            for file_path in file_paths:
                with open(file_path, "wb") as f:
                    # The chosen upperbound for count of bytes we consume by writing to these
                    # files is somewhat arbitrary and may be worth experimenting with if the
                    # fuzzer coverage plateaus.
                    f.write(fdp.ConsumeBytes(fdp.ConsumeIntInRange(1, 512)))

            repo.index.add(file_paths)
            repo.index.commit(fdp.ConsumeUnicodeNoSurrogates(fdp.ConsumeIntInRange(1, 80)))

            fuzz_tree = git.Tree(repo, git.Tree.NULL_BIN_SHA, 0, "")

            try:
                fuzz_tree._deserialize(io.BytesIO(data))
            except IndexError:
                return -1


    def main():
        atheris.Setup(sys.argv, TestOneInput)
        atheris.Fuzz()


    if __name__ == "__main__":
        main()
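The harness above bounds how much fuzz input goes into each file (at most 3 files, at most 512 bytes each). The following standard-library sketch shows that idea in isolation. `ByteCursor` and `plan_files` are hypothetical names introduced here; the cursor is a simplified stand-in and does not reproduce the byte-consumption order or exact semantics of `atheris.FuzzedDataProvider`.

```python
class ByteCursor:
    """Hypothetical, simplified stand-in for a fuzzed-data provider."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def consume_int_in_range(self, low, high):
        """Map the next input bytes into [low, high]; yields low when exhausted."""
        span = high - low
        width = max(1, (span.bit_length() + 7) // 8)  # bytes needed to cover the span
        raw = self._data[self._pos : self._pos + width]
        self._pos += len(raw)
        return low + int.from_bytes(raw, "big") % (span + 1)

    def consume_bytes(self, count):
        """Return up to `count` bytes; fewer if the input runs out."""
        chunk = self._data[self._pos : self._pos + count]
        self._pos += len(chunk)
        return chunk


def plan_files(data, max_files=3, max_len=512):
    """Split raw input into a bounded number of bounded-size file payloads."""
    cursor = ByteCursor(data)
    file_count = cursor.consume_int_in_range(1, max_files)
    return [
        cursor.consume_bytes(cursor.consume_int_in_range(1, max_len))
        for _ in range(file_count)
    ]


payloads = plan_files(bytes([2, 0, 3]) + b"abcdef")
```

Because both the file count and each payload length are capped, the disk I/O per iteration stays small regardless of how large the fuzzer's input grows.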
This file was deleted.