Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more EIP syncs for x86 target memory hooks #2064

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from

Conversation

Evian-Zhang
Copy link

Inside memory hooks, value of RIP register may be wrongly retrieved, especially for SIMD instructions (since they do not use gen_op_ld_v and gen_op_st_v for memory access).

However, this behavior is random and I cannot provide an example for this bug. I came across this bug when unicorn is used as fuzzing executor with AFL++.

BTW, you can see many usages of tcg_gen_qemu_ld_xx and tcg_gen_qemu_st_xx across the whole translate.c. Maybe we should add this EIP sync before each callsite? However, there are too many usages (in some rarely-seen instructions), and I could not tell whether we should add this sync. So this PR will focus on the most common usages.

@wtdcode
Copy link
Member

wtdcode commented Dec 12, 2024

Could you add tests to illustrate the instructions?

@Evian-Zhang
Copy link
Author

Sorry I could not provide a concrete test case, as this bug occurs randomly.

I could explain in theory. As the eip is already synced in normal memory accesses, this makes sure that the RIP we get in memory hooks is correct. However, the SIMD instructions do not use those functions to access memories. So in some rare cases where eip is not synced, we could only got the wrong eip that lastly synced.

As we already sync eip in normal memory accesses, there is no reason not adding those syncs for SIMD memory accesses.

@wtdcode
Copy link
Member

wtdcode commented Dec 12, 2024

Sorry I could not provide a concrete test case, as this bug occurs randomly.

I could explain in theory. As the eip is already synced in normal memory accesses, this makes sure that the RIP we get in memory hooks is correct. However, the SIMD instructions do not use those functions to access memories. So in some rare cases where eip is not synced, we could only got the wrong eip that lastly synced.

As we already sync eip in normal memory accesses, there is no reason not adding those syncs for SIMD memory accesses.

I understand this happens in a quite large codebase. I mean you can provide a unit case including only 1 SIMD instruction and 1 memory hook and the PC is not synced. You can find lots of samples in tests/unit/test_x86.c

@wtdcode
Copy link
Member

wtdcode commented Dec 12, 2024

This ensures our future releases won’t break your uses cases.

@Evian-Zhang
Copy link
Author

I understand the necessity to add test case. However, by saying "random", I mean the bug is not reproducible. I do meet one SIMD instruction which does not sync EIP during one execution, however, in another execution, it is synced, where somewhere else is not synced. This is because I meet this bug when I use unicorn as fuzzing engine with AFL++, which uses a lot of low-level optimizations such as inter-process TB caches, which makes this bug hard to catch and hard to reproduce in a small testcase.

@wtdcode
Copy link
Member

wtdcode commented Dec 12, 2024

I understand the necessity to add test case. However, by saying "random", I mean the bug is not reproducible. I do meet one SIMD instruction which does not sync EIP during one execution, however, in another execution, it is synced, where somewhere else is not synced. This is because I meet this bug when I use unicorn as fuzzing engine with AFL++, which uses a lot of low-level optimizations such as inter-process TB caches, which makes this bug hard to catch and hard to reproduce in a small testcase.

Are you using unicornafl?

@Evian-Zhang
Copy link
Author

Yes

@wtdcode
Copy link
Member

wtdcode commented Dec 12, 2024

Unicornafl bundled a quite old Unicorn at this moment, which I will update once next patch release lands. Have you updated the bundled unicorn version?

@Evian-Zhang
Copy link
Author

I use the latest unicorn at dev branch.

@Evian-Zhang
Copy link
Author

During these days, I have tried my best to construct a series of instructions which could minimally reproduce this bug, but still can't find one. The unsync of EIP happens very randomly, and I am not fully aware of the low-level mechanism of QEMU about why the EIP is unsynced.

However, I'm pretty sure the fix I posted here is correct and should work (and without the fix, the bug does exist), both theoretically (since gen_op_ld_v syncs the EIP, why not gen_ldo_env_A0?) and practically (after this fix, I do not meet that bug any more).

@wtdcode
Copy link
Member

wtdcode commented Dec 29, 2024

I understand the root cause here. We should sync PC before any ld/st, however, the initial port was very incomplete.

Your fix is correct indeed.

@wtdcode
Copy link
Member

wtdcode commented Dec 29, 2024

I merged the dev branch and triggered CI again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants