Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order of announcing executed instructions #702

Open
xThaid opened this issue May 10, 2024 · 2 comments
Open

Order of announcing executed instructions #702

xThaid opened this issue May 10, 2024 · 2 comments
Labels
optimization This is *just* an optimization!

Comments

@xThaid
Copy link

xThaid commented May 10, 2024

While working on #699 I found that increasing the size of the instruction buffer causes performance loss on a benchmark.

Here's what happened. Increased size of the instruction buffer caused higher utilization of backend (in particular ROB). Thus, it is more likely that two instructions will be ready to announce at the same time and marked as done in ROB. However, currently we collect finished instructions from FUs with no specific order. And this is the reason why we get the slowdown in crc32.

Specifically, at some point two instructions are ready to be collected and the newer (with higher ROB id) is selected and marked as done in ROB. Therefore in the next cycle we cannot retire the older instruction yet (because it is not marked as done yet) and we lose one cycle. This happens every program loop iteration and in total results in many wasted cycles.
With lower backend utilization, the instruction causing the problem wouldn't be ready to execute yet and we would retire the oldest instruction as soon as possible.

Choosing the order of the instructions we want to announce doesn't seem to be a trivial task -- sounds like a scheduling problem and we don't have full information about future instructions. I think we should simply choose the oldest instruction. By doing that we are releasing the resources as soon as possible and making space for further instructions.

The only small issue is that currently we cannot reliably tell which ROB id is older - ROB id is a circular pointer. Either we add one bit to every ROB id or we do some heuristics like a < b iff (b - a) % rob_size < rob_size / 2 (which will work with the assumption that instructions with ROB ids different more than rob_size / 2 will never be ready to announce at the same time).

What do you think about this problem?

@xThaid xThaid added the bug Something isn't working label May 10, 2024
@Kristopher38
Copy link
Member

You can reliably tell which rob id is older if you know the start and end indices of its circular buffer. A nonexclusive method to get them, get_indices, already exists in the rob.

@tilk
Copy link
Member

tilk commented May 13, 2024

Two thoughts:

  • This might be partially related to the RS select order - a bad order of selecting instructions from RS can impact performance.
  • Announcement needs to be reworked later for superscalarity support. Superscalarity might fix this performance issue in some cases.

All in all, I think this shouldn't be our priority now.

@tilk tilk added optimization This is *just* an optimization! and removed bug Something isn't working labels May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimization This is *just* an optimization!
Projects
None yet
Development

No branches or pull requests

3 participants