Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix both branches dead from bot address in array #1233

Merged
merged 5 commits into from
Jan 24, 2024
Merged

Conversation

sim642
Copy link
Member

@sim642 sim642 commented Nov 2, 2023

Closes #1232. Closes #1188.

The fix is from #1188 (comment). I'm not sure if this is the right fix, but it is a fix.

Copy link
Member

@michael-schwarz michael-schwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving the value inside the array a super-bot value instead of a bottom value of correct type might fix this example, but is not a principled solution one would want to merge.

Such a solution should address what is read.

@sim642
Copy link
Member Author

sim642 commented Nov 2, 2023

Such a solution should address what is read.

The points-to set is empty, there is nothing to read!

The morally right thing would be to have the value that's actually there: a top address set. But some array domains don't work that way and there's nothing nicer we can do about it before the PLDI deadline.

@michael-schwarz
Copy link
Member

Maybe you can use the partitioned array domain and not this one for which it is dubious what it does for programs with undefined behavior such as this example anyway?
Or have a separate branch for your PLDI drive, instead of merging this into the master?

@sim642
Copy link
Member Author

sim642 commented Nov 2, 2023

Maybe you can use the partitioned array domain and not this one for which it is dubious what it does for programs with undefined behavior such as this example anyway?

There's nothing undefined about this. It can arise from every unknown pointer and that's simply what happens in all the programs from the issues. It's our as-sound-as-possible handling of unknown pointers that's causing us to be unsound in this case.

@sim642
Copy link
Member Author

sim642 commented Nov 2, 2023

Maybe you can use the partitioned array domain

And switching the domain is not an option because it makes the analysis more expensive. This issue arises with the large-program example config that we give and want everyone else to use.

@michael-schwarz
Copy link
Member

Then maybe what we have to do for invalidate calls is to assign a top_of_type value, even if this means never being able to recover from it. Starting from bottom is ok if the program does not exhibit undefined behavior, as all places that will be read will be written.
For invalidation it is not ok to put bottom, as even if some cells are overwritten, the old values set by the function for which we do not have the source code may still be read without the program having Undefined Behavior.
The non-partitioning array-domain is extremely dubious for code that has library functions that write into arrays for this reason, and should not be used for any programs that contain such calls to library functions.

@sim642
Copy link
Member Author

sim642 commented Nov 3, 2023

Then maybe what we have to do for invalidate calls is to assign a top_of_type value

My change is in that top_value. Not putting bottoms would mean eliminating the weird can_recover_from_top logic.

The non-partitioning array-domain is extremely dubious for code that has library functions that write into arrays for this reason, and should not be used for any programs that contain such calls to library functions.

Partitioned arrays are a single joined value just like trivial arrays if there's no partitioning. If that case of partitioned arrays is somehow more sound, we should just make trivial arrays behave the same way. Invalidation should kill any partitioning anyway, so it's not the partitioning that's making the result more sound here.

@michael-schwarz
Copy link
Member

michael-schwarz commented Nov 3, 2023

My change is in that top_value. Not putting bottoms would mean eliminating the weird can_recover_from_top logic.

Yes, we just have to make sure it's not at the expense of killing any precision for the non-partitioned array domain in other places where top_value is used but putting an initial bottom is ok (such as for initial values, as discussed before).
So doing this would require a careful review of where top_value is used, and replacing any uses where initializing with bottom is ok.
Then can_recover_from_top can be limited to computing these initial values. I would still keep it there, because otherwise we might just as well not track arrays at all, which would be bad. On the other hand, initializing those array domains that can recover from top to top allows being sound for programs with UB, so it is an important distinction to keep.

@sim642 sim642 added this to the v2.4.0 milestone Nov 15, 2023
src/cdomains/valueDomain.ml Outdated Show resolved Hide resolved
@sim642 sim642 self-assigned this Jan 9, 2024
@michael-schwarz
Copy link
Member

I'll review this in detail either tomorrow or Saturday.

Copy link
Member

@michael-schwarz michael-schwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems at first glance that top_value is indeed only called in places where it is not an initial value. I would still feel more comfortable if we could do a run on SV-COMP to confirm it has no precision impact before merging this.

@sim642
Copy link
Member Author

sim642 commented Jan 24, 2024

I did before vs after benchmarking on sv-benchmarks and there's no difference in results modulo noise.

@sim642 sim642 merged commit 2b98818 into master Jan 24, 2024
17 of 19 checks passed
@sim642 sim642 deleted the concrat-both-branches branch January 24, 2024 10:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Both branches dead unsoundness on Concrat Both branches dead in zstd
3 participants