-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Refactoring the RISCV architecture to Auto-Sync on LLVM #2756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: next
Are you sure you want to change the base?
Conversation
arch/RISCV/RISCVInstPrinter.c
Outdated
SStream_concat0(markup(O, Markup_Register), "x18"); | ||
} | ||
break; | ||
case RISCVZC_RLISTENCODE::RA_S0_S3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These namespace specifiers should have been replaced by AS. This might be a bug.
arch/RISCV/RISCVInstPrinter.c
Outdated
|
||
if (PrintBranchImmAsAddress) { | ||
uint64_t Target = Address + MCOperand_getImm(MO); | ||
if (!STI.hasFeature(RISCV_Feature64Bit)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The STI.hasFeature
should probably be automated away as well.
You can check the STIFeatureBits
patch and just copy and adapt it if you have some time.
arch/RISCV/RISCVInstPrinter.c
Outdated
if (SysReg && SysReg->haveRequiredFeatures(STI.getFeatureBits())) | ||
SStream_concat0(markup(O, Markup_Register), SysReg->Name); | ||
else | ||
SStream_concat0(markup(O, Markup_Register), formatImm(Imm)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, better get rid of these formatImm
calls. Otherwise we have two places to take care of formatting and everything gets less consistent.
// which will not print trailing zeros and will use scientific notation | ||
// if it is shorter than printing as a decimal. The smallest value requires | ||
// 12 digits of precision including the decimal. | ||
if (FPVal == (int)(FPVal)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd need to check the standard, but casting an float to an integer just rounds it up/down.
Please just use the SStream function for printing it.
cc @hainest You are probably aware of this PR, just wanted to ping you as well. Because you mentioned in the Zydis discussion you use Capstone for RISCV as well. |
suite/cstest/include/test_mapping.h
Outdated
{ .str = "riscv32", .val = CS_MODE_RISCV32 }, | ||
{ .str = "riscv64", .val = CS_MODE_RISCV64 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In response to above. Please stick with the Capstone flag naming convention.
Is RSICV32
is anything else than just 32bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have a point, but I don't know, RISCV[32|64] and RISCVC were legacy flags that were already in the old plugin, so I kept them.
Removing them would probably not affect anything but just be a hassle, it could also be the case that the generated code references them from lots of places and changing the generated code is always a hassle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generated code references them
Which one? The test files from the MCUpdater? Those ones can be replaced as described here #2756 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Rot127 sorry for the late reply, yes, only the yaml MC tests reference those values, plus a few references in handwritten code in the cstool.c
and so on, so it's doable to replace them, will do.
Although this might break APIs, because I see some Python usage from my editor search, until now I have avoided deleting enum members from includes/riscv.h
, that's why we still have CS_MODE_RISCV_C
and CS_MODE_RISCVC
.
cstool/cstool.c
Outdated
{ "+c", "Enables RISCV C extension", { | ||
CS_ARCH_RISCV, CS_ARCH_MAX }, 0, CS_MODE_RISCV_C }, | ||
{ "+fd", "Enables RISCV F and D extensions ", { | ||
CS_ARCH_RISCV, CS_ARCH_MAX }, 0, CS_MODE_RISCV_FD}, | ||
{ "+v", "Enables RISCV V extension ", { | ||
CS_ARCH_RISCV, CS_ARCH_MAX }, 0, CS_MODE_RISCV_V}, | ||
{ "+inx", "Enables RISCV Zfinx, Zdinx, and Zhinx extensions," | ||
" zhinxmin is also enabled as it's subset of zhinx ", { | ||
CS_ARCH_RISCV, CS_ARCH_MAX }, 0, CS_MODE_RISCV_ZFINX}, | ||
{ "+zcmp-t-e", "Enables the following RISCV code size reduction extensions: zcmp, zcmt and zce", { | ||
CS_ARCH_RISCV, CS_ARCH_MAX }, 0, CS_MODE_RISCV_ZCMP_ZCMT_ZCE}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like many modes are missing:
CS_MODE_RISCV_C = 1 << 2, ///< RISCV compressed instructure mode
CS_MODE_RISCV_FD = 1 << 3,
CS_MODE_RISCV_V = 1 << 4,
CS_MODE_RISCV_ZFINX = 1 << 5,
CS_MODE_RISCV_ZCMP_ZCMT_ZCE = 1 << 6,
CS_MODE_RISCV_ZICFISS = 1 << 7,
CS_MODE_RISCV_E = 1 << 8,
CS_MODE_RISCV_A = 1 << 9,
CS_MODE_RISCV_COREV = 1 << 10,
CS_MODE_RISCV_THEAD = 1 << 11,
CS_MODE_RISCV_SIFIVE = 1 << 12,
CS_MODE_RISCV_BITMANIP = 1 << 13,
CS_MODE_RISCV_ZBA = 1 << 14,
CS_MODE_RISCV_ZBB = 1 << 15,
CS_MODE_RISCV_ZBC = 1 << 16,
CS_MODE_RISCV_ZBKB = 1 << 17,
CS_MODE_RISCV_ZBKC = 1 << 18,
CS_MODE_RISCV_ZBKX = 1 << 19,
CS_MODE_RISCV_ZBS = 1 << 20,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, ideally we need at least one cstool test per mode. Just my thoughts. @Rot127 WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should be independent flags like you would setup gcc/clang when specifying the cpu options
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just my thoughts. @Rot127 WDYT?
Yes, good idea. For the sake of simplicity we can implement them in Python in a new sub directory tests/integration/cstool
I would suggest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wargio sorry for the late reply, but do you mean we should have a seperate flag per each RISCV extension ? my impression is that we said that the goal is to combine as much as possible into coarse-grained flags.
(I recognize that at the moment some RISCV extensions don't feature in any flags, that just means that no test failed because they were enabled, so they're always catch-all enabled in the feautre-checking logic. We can always add flags later.)
I think we can't have a seperate flag per extension because RISCV kept dividing and sub-dividing extensions till they exceeded 32, and we're using an int32 to store the feature bitvector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but do you mean we should have a seperate flag per each RISCV extension ?
Yes, for each individual one, not for combinations. We don't know what the use cases of people are. So allowing for finer configs is important.
Also there are not that many if I am not mistaken?
https://en.wikichip.org/wiki/risc-v/standard_extensions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the fine-grained config, but the link is probably incomplete, off the top of my head I can't see the half-percision floating point extension. I'm sure we find others not listed if we look carefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also group some of the LLVM extensions in a reasonable way.
We don't need and should not copy them one to one.
E.g. casually looking at RISCVFeatures.td
seems to have common prefixes for the different Z extensions. Where Zb
, Zc
, Zf
etc. are certain extensions and Zfhmin
, Zfh
etc, are just sub-categories.
So just creating an extension for Zf
, Zb
, Zc
etc. would be sufficient IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check out the helper functions in Mapping.h. They should always be used for these tasks. Because they do bounds checking and such. Accessing the struct members directly (e.g. like details->groups[details->groups_count]
) should be the absolute exception, not the rule.
You can also see how the helpers are used in other architectures and just do it the same way. LoongArch is a good example, because it is relatively simple.
AArch64 has many very complex operands. You can check its code if you need examples for these cases.
The conflicts are coming from the clang-format formatting we did recently btw. So nothing to preserve there. |
} \ | ||
} | ||
|
||
#define compare_string_from_int_ret(actual, expected, converter, ret_val) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea
// for weird reasons some instructions end up with valid operands that are | ||
// interspersed with invalid operands, i.e. the operands array is an "island" | ||
// of valid operands with invalid gaps between them, this function will compactify | ||
// all the valid operands and pad the rest of the array to invalid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you debugged one or two of them? It could be an indicator something went wrong during the C++ -> C translation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, didn't think of that.. the operands array already behaves strangely, for example it can be filled out of order (the operand at [1] will be put in the array before the operand at [0]), I just dismissed those erratic behaviours as how the generated code behaves
I will see one of them in details.
…son, and riscv.h comments for generated content added
…test failures: 1230 out of 4757
…ated, it doesn't compile yet but is mostly valid C except for a few problems
12882dc
to
1d59145
Compare
…ift amounts, as it's a bug wtih LLVM upstream
|
||
float getFPImm(unsigned Imm) | ||
{ | ||
CS_ASSERT(Imm != 1 && Imm != 30 && Imm != 31 && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imm
can still be greater than the array boundaries.
Please check this as well.
Your checklist for this pull request
Detailed description
This PR is attempting to refactor the RISCV architecture to be updatable from the Auto-Sync tool, such that it automatically follows RISCV-in-LLVM additions and fixes. Following the refactoring guide
depends-on: capstone-engine/llvm-capstone#83
Test plan
...
Closing issues
...