How to configure the corpus for jest in regression mode? #637

Open
karfau opened this issue Oct 9, 2023 · 4 comments

karfau commented Oct 9, 2023

I read through https://github.com/CodeIntelligenceTesting/jazzer.js/blob/main/docs/jest-integration.md and am also aware of https://github.com/CodeIntelligenceTesting/jazzer.js/blob/main/packages/core/options.ts, but I could not work out how to configure where those tests pick up their input data from.

Do they have to stay in the top level as crash-<hash> files?

I would love to point to a separate directory like when passing "corpus" as an argument, so that the test runner picks up the files contained in it and uses those for the regression test.
Is that possible?

I enabled verbose logging, and the test is reported as being run. But what data is passed to it in regression mode if no input is found?

Because I found this code:

static readonly defaultCorpusDirectory = ".cifuzz-corpus";

I tried adding my crash- files to the .cifuzz-corpus directory, but didn't see anything happen.
When running the tests in regression mode, I observed that it creates a directory structure inside that folder which reflects the name of the test file, the describe message and the it.fuzz/test.fuzz messages. I also added my crash files there, but it didn't change anything.

To understand what is being passed to the targets, I added the following to my test suite:

describe('ensure previous fuzzer findings are not reintroduced', () => {
	test.fuzz('console.log', (data) => console.log(data.toString()));
});

which only leads to the following output:
[image: screenshot of the test output]

Here is where you can see all the changes I did so far: xmldom/xmldom#556
(2 commits pushed to that branch at the point of posting this)

oetr (Contributor) commented Oct 9, 2023

The Jazzer.js jest-runner can run in two modes: regression and fuzzing. Jazzer.js auto-generates the directories for each mode and test. In regression mode, for the tests in your regression.test.fuzz.js, three directories are generated in the test directory, based on the describe blocks and test names:

├── regression.test.fuzz
│   └── ensure_previous_fuzzer_findings_are_not_reintroduced
│       ├── console.log
│       ├── dom-parser.html.fuzz.js
│       └── dom-parser.xml.fuzz.js

Files in these three directories will be used for their corresponding tests.
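
To make the mapping concrete, here is a minimal sketch of how such a directory path could be derived from a test. `regressionDirFor` is a hypothetical helper, not part of Jazzer.js. It assumes only what the tree above shows: the `.js` suffix of the test file is dropped and spaces in describe and test names become underscores. The real sanitization rules in Jazzer.js may cover more characters.

```javascript
const path = require("path");

// Hypothetical helper, NOT part of Jazzer.js: mirrors the layout in the tree above.
// Assumption: the ".js" suffix is dropped and whitespace becomes underscores.
function regressionDirFor(testFile, describeNames, testName) {
  const base = testFile.replace(/\.js$/, "");
  const segments = [...describeNames, testName].map((name) =>
    name.replace(/\s+/g, "_"),
  );
  return path.join(base, ...segments);
}

// Prints the directory for the console.log test from this thread.
console.log(
  regressionDirFor(
    path.join("test", "regression.test.fuzz.js"),
    ["ensure previous fuzzer findings are not reintroduced"],
    "console.log",
  ),
);
```

If in doubt, running the fuzz tests once and looking at the directories Jazzer.js generates is the more reliable check.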

In fuzzing mode, the corpus lives in .cifuzz-corpus in the main project directory, where you have already put some files (the first two will be ignored, because they are not associated with any test):

├── .cifuzz-corpus
│   ├── crash-doctype-in-element.xml
│   ├── crash-proto-prefix.xml
│   └── regression.test.fuzz
│       └── ensure_previous_fuzzer_findings_are_not_reintroduced
│           ├── console.log
│           │   ├── crash-doctype-in-element.xml
│           │   └── crash-proto-prefix.xml
│           ├── dom-parser.html.fuzz.js
│           │   ├── crash-doctype-in-element.xml
│           │   └── crash-proto-prefix.xml
│           └── dom-parser.xml.fuzz.js
│               ├── crash-doctype-in-element.xml
│               └── crash-proto-prefix.xml

Fuzzing mode also uses the inputs from the regression directory of each test; regression mode, however, only uses inputs from the regression directory.

Also, the first input that the fuzzer tries is an empty input (does not contain any bytes). If you have no files in the regression directory of the corresponding test, this will be the only input that the fuzzer will try. That should explain your screenshot above.
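
As a rough illustration of that behaviour, the sketch below builds the list of inputs a test would see in regression mode: the empty input first, then one input per file in the test's regression directory. `collectRegressionInputs` is a hypothetical helper written for this explanation, not the Jazzer.js implementation, and the order of the files after the empty input is an assumption.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper, NOT the Jazzer.js implementation.
// Returns the empty input followed by the contents of each file in `dir`.
function collectRegressionInputs(dir) {
  const inputs = [Buffer.alloc(0)]; // the empty input is always tried first
  if (!fs.existsSync(dir)) {
    return inputs; // no regression directory yet: only the empty input
  }
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.isFile()) {
      inputs.push(fs.readFileSync(path.join(dir, entry.name)));
    }
  }
  return inputs;
}

// Usage: log every input as a string, like the console.log test does.
const exampleDir = path.join(
  "test",
  "regression.test.fuzz",
  "ensure_previous_fuzzer_findings_are_not_reintroduced",
  "console.log",
);
for (const data of collectRegressionInputs(exampleDir)) {
  console.log(JSON.stringify(data.toString()));
}
```

With an empty or missing directory, the loop runs once with an empty buffer, which matches a test that is reported as run but logs nothing visible.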

karfau (Author) commented Oct 9, 2023

@oetr thank you for the explanation.
Sadly I still don't understand why the files I added are not considered by the regression tests.

I would have expected that the two samples are going to be logged by the test using console.log.

> regression mode only uses inputs from the regression directory.

Do you mean I should add a regression sub directory to each of those directories and move the files there?

(Is there any documentation/executable example that shows this?)

oetr (Contributor) commented Oct 9, 2023

> Do you mean I should add a regression sub directory to each of those directories and move the files there?

Jazzer.js automatically generates all necessary directories for both modes (regression, fuzzing) after you run your jest fuzz tests once.

The regression inputs always live in a folder next to the test file, named after it.
The regression inputs for your console.log test in test/regression.test.fuzz.js should go into test/regression.test.fuzz/ensure_previous_fuzzer_findings_are_not_reintroduced/console.log/
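
For existing findings, moving them into place can be scripted. The sketch below copies every `crash-*` file from one directory into a test's regression directory, creating that directory if needed. `copyCrashFiles` is a hypothetical helper, not a Jazzer.js API; it uses only Node's `fs` and `path` modules.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper, NOT a Jazzer.js API.
// Copies every "crash-*" file from `fromDir` into `regressionDir`
// and returns the names of the files that were copied.
function copyCrashFiles(fromDir, regressionDir) {
  fs.mkdirSync(regressionDir, { recursive: true });
  const copied = [];
  for (const name of fs.readdirSync(fromDir)) {
    const source = path.join(fromDir, name);
    if (name.startsWith("crash-") && fs.statSync(source).isFile()) {
      fs.copyFileSync(source, path.join(regressionDir, name));
      copied.push(name);
    }
  }
  return copied;
}
```

For the console.log test, a call could look like `copyCrashFiles(".", "test/regression.test.fuzz/ensure_previous_fuzzer_findings_are_not_reintroduced/console.log")`, run from the project root.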

> (Is there any documentation/executable example that shows this?)

You can check out the example in https://github.com/CodeIntelligenceTesting/jazzer.js/blob/main/examples/jest_integration/integration.fuzz.js, where the regression inputs are in the subfolder ./integration.fuzz.

It is somewhat cryptically mentioned in https://github.com/CodeIntelligenceTesting/jazzer.js/blob/main/docs/jest-integration.md#fuzzing-mode, but I will add a ticket to improve this part of our documentation.

Thanks for reporting this!

karfau (Author) commented Oct 9, 2023

Worked like a charm.

Here is my "generic" solution for a folder that contains a single regression test suite and multiple fuzz targets. It reflects the directory structure that jazzer.js generates, will yell at you for targets without any input files, and originally also verified that each target is called the expected number of times:

'use strict';

const { describe, expect, test, beforeAll } = require('@jest/globals');
const fs = require('fs');
const path = require('path');
// Every *.target.js file next to this test suite is treated as a fuzz target.
const TARGETS = fs.readdirSync(__dirname).filter((file) => file.endsWith('.target.js'));

TARGETS.forEach((target) => {
	describe('', () => {
		beforeAll(() => {
			// Inputs are expected in <this file without .js>/<target>/; fail if there are none.
			const testfiles = fs.readdirSync(path.join(__filename.replace(/\.js$/, ''), target));
			expect(testfiles.length).toBeGreaterThan(0);
		});
		// Named targetModule to avoid shadowing the CommonJS `module` object.
		const targetModule = require(path.join(__dirname, target));
		test.fuzz(target, (data) => targetModule.fuzz(data));
	});
});

(Update: I needed to drop the assertion on the calls made to the spy; it didn't work reliably.)
Improved docs will surely help other people trying the same.
