Protobuf format in input files is not user friendly #847
Comments
We currently decode the byte array provided by libFuzzer in every execution, so I am a bit worried that switching out the highly optimized binary format for the fully reflection-based text format will regress fuzz test performance. Have you tested this on a representative fuzz test? The general approach we have been following to make sense of seeds is to rely on the JUnit integration, which allows developers to inspect the fuzz test parameters as proper Java objects simply by running the fuzz test in test mode and setting a breakpoint. We have found that inspecting input files directly creates friction for developers, even if the format is relatively straightforward. But I can see how that would be different in environments where Protobuf is used heavily and the fuzz test accepts only a single Proto parameter.
I also haven't tested the performance and I didn't know that the encoding occurs in every iteration. This certainly wouldn't be great. Thinking out loud... the encoding during fuzzing could be different from the encoding for user-facing objects. But I understand that this may require a major change in the code structure.
We are looking into reusing the in-memory objects if the input bytes haven't been loaded from disk. This does require patching libFuzzer though. If you can run a simple performance test on a real-world fuzz test, that could provide us with very relevant data.
I fully agree. Again, this would be possible by patching libFuzzer and is certainly something we could consider. It's just that so far we found it more effective to improve the Java debugging experience, which somewhat sidesteps the question of what a human-readable input file should look like.
It's a little hard to judge the performance requirements for all fuzz targets. libprotobuf-mutator gives users the option to mutate binary protos or text protos, and the default option is mutating text protos: https://github.com/google/libprotobuf-mutator/blob/master/src/libfuzzer/libfuzzer_macro.h#L26-L35 I'm not sure which should be the default, but would it be possible for Jazzer to offer both options?
We will look into this and other ways to make the corpus entries easier to handle eventually. We are currently focusing on polishing the JUnit 5 based workflow though, so I can't say yet when we will get to this.
Hi @hadi88! Did you ever get your issue with Jazzer resolved? We just need to understand in detail what you are trying to achieve so that we can suggest the best options for solving it.
Protobufs are being serialized and parsed using the Protobuf binary format. This makes it hard for users to look through input files. For example, it's almost impossible to read a crash-producing input to understand the crash.
With small changes to mutation/mutator/proto/BuilderMutatorFactory.java, the write and read methods could use the Protobuf TextFormat utility so that input files contain protos in human-readable text format.
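
To make the two formats concrete, here is a minimal sketch of a binary and a text round trip. It is not Jazzer's actual code: the class `ProtoFormatSketch` and its method names are hypothetical, and it uses the well-known `Timestamp` message only because it ships with protobuf-java. It assumes a reasonably recent protobuf-java on the classpath (for `TextFormat.printer()`).

```java
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.TextFormat;
import com.google.protobuf.Timestamp;

// Hypothetical illustration class; not part of Jazzer.
public class ProtoFormatSketch {
  // Binary wire format: compact and fast to parse, but opaque in an input file.
  static byte[] writeBinary(Timestamp message) {
    return message.toByteArray();
  }

  static Timestamp readBinary(byte[] bytes) throws InvalidProtocolBufferException {
    return Timestamp.parseFrom(bytes);
  }

  // Text format: human-readable, but printed and parsed via reflection.
  static String writeText(Timestamp message) {
    return TextFormat.printer().printToString(message);
  }

  static Timestamp readText(String text) throws TextFormat.ParseException {
    Timestamp.Builder builder = Timestamp.newBuilder();
    TextFormat.getParser().merge(text, builder);
    return builder.build();
  }

  public static void main(String[] args) throws Exception {
    Timestamp original = Timestamp.newBuilder().setSeconds(1700000000L).setNanos(42).build();

    byte[] binary = writeBinary(original);
    String text = writeText(original);

    // The text form shows field names and values, one per line.
    System.out.print(text);
    System.out.println("binary size: " + binary.length + " bytes");

    // Both encodings should round-trip to an equal message.
    System.out.println("binary round trip ok: " + readBinary(binary).equals(original));
    System.out.println("text round trip ok: " + readText(text).equals(original));
  }
}
```

Whether the text variant is fast enough to run on every fuzzing iteration is exactly the open question in this thread; the sketch only shows that both encodings carry the same message.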