Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loongarch: Implement ll/sc support for atomic operations #254

Merged
merged 1 commit into from
Jul 5, 2024

Conversation

lrzlin
Copy link
Contributor

@lrzlin lrzlin commented Jun 30, 2024

Implement ll/sc for loongarch atomic load & store, also fix amcas bugs.
Tested under 3C5000 and 3A6000 (with amcas support).

case SLJIT_MOV_P:
ins = SC_D;
break;
case SLJIT_MOV_S32:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SLJIT_MOV_U32?

signed types should return SLJIT_ERR_UNSUPPORTED AFAIK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RISC-V and MIPS all use almost the same ll/sc for atomic load/store as LoongArch, and they all support signed types, so I think it wouldn't be an issue. BTW, I have noticed that in sljit_src/sljitNativeMIPS_common.c line 4234 has a SLJIT_CONFIG_RISCV_64 comment, should be a typo. @zherczeg

Copy link
Contributor

@carenas carenas Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signed types should return SLJIT_ERR_UNSUPPORTED AFAIK

I stand corrected, apparently the "spec" (see comments in sljitLir.h) was changed with 8bcd711, so guess that the ones still rejecting signed types might be the ones that are not up to it.

Eitherway you are still missing SLJIT_MOV_U32 there, right?, this work with 6d44b46 on top works fine in an emulated 3A5000 (without AMCAS)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it okay to use signed 32-bit ll/sc for 32-bit unsigned atomic load/store directly? I'm worry about boundary cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely not but is better than leaving it out as unsupported, and indeed I was surprised when it didn't break (as it did in a previous attempts to implement this as shown by #172).

which is why I was curious about the bug fixed, since it seems obvious that our test coverage needs some serious overhaul.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the reason why is that ll.d/sc.w is differ from ll.w/sc.w, but no matter what I don't think add unsigned 32-bit is a good solution, it was likely broken at the time we enhanced the test suite. Furthurmore, if the ll.w/sc.w could work well on loongarch, it should alse benifit mips & riscv, in that case let's add that in a seperate patch.

Copy link
Contributor

@carenas carenas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you elaborate on the nature of the bugs that were fixed?

@lrzlin
Copy link
Contributor Author

lrzlin commented Jul 1, 2024

could you elaborate on the nature of the bugs that were fixed?

Originally, SLJIT_MOV_S32 was missing so that when it was used, amcas will fall to amcas.d and leading to an incorrect result.

@zherczeg
Copy link
Owner

zherczeg commented Jul 2, 2024

Now the atomic instruction can tell the user whether it supports signed or unsigned operations (or both, but that is unlikely). Users can use sign/zero extension if they need it. MOV32 has no sign, it is up to the cpu, if registers are more than 32 bit long. riscv in mips is definitely a typo.

@lrzlin
Copy link
Contributor Author

lrzlin commented Jul 2, 2024

MOV32 has no sign, it is up to the cpu, if registers are more than 32 bit long. riscv in mips is definitely a typo.

Understand, I'll fix that typo, besides, are there still any problem for this patch?

case SLJIT_HAS_CLZ:
case SLJIT_HAS_CTZ:
case SLJIT_HAS_REV:
case SLJIT_HAS_ROT:
case SLJIT_HAS_PREFETCH:
case SLJIT_HAS_COPY_F32:
case SLJIT_HAS_COPY_F64:
case SLJIT_HAS_ATOMIC:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I see it correctly, that the cpu has new LL/SC form but CAS form is also supported? This is true for ARM64, so I am thinking whether we should allow users to pick which form they prefer. E.g. add a flag to pick the type, then return with unsupported if not available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LoongArch originally only support LL/SC, but add CAS support on LoongArch V1.10 (Currently only 3A6000 support this extension, QEMU do not support it yet). So it's safe to always use LL/SC, but CAS give more fine-grained atomic operations support.

To allow user's selection, maybe we should add a type param to sljit_emit_atomic_load or sljit_emit_atomic_store, or we could add a SLJIT_HAS_CAS flag for it? But I still think there is not too much need to offer choices. Unless CAS is slower that LL/SC on some specific platform.

Reference: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#new-in-loongarch-v1.1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: AMCAS is faster than LL/SC on LoongArch

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will think about it.

@@ -2934,7 +2932,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
break;
case SLJIT_ATOMIC_STORED:
case SLJIT_ATOMIC_NOT_STORED:
FAIL_IF(push_inst(compiler, SLTUI | RD(dst_r) | RJ(EQUAL_FLAG) | IMM_I12(1)));
FAIL_IF(push_inst(compiler, SLTUI | RD(dst_r) | RJ(OTHER_FLAG) | IMM_I12(1)));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I changed it for convenience, you don't need to change it if using the equal flag is easier for the target cpu.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way should be okay, but I think it's better to stay the same with other RISC architectures for maintainability.

switch (GET_OPCODE(op)) {
case SLJIT_MOV:
case SLJIT_MOV_P:
ins = LL_D;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember the current status, but I suspect LL_D is not available for 32 bit systems. Do we only support 64 bit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By now, almost every LoongArch chip are 64-bit, AFAIK, only 1c series mcu are 32-bit, so I think it's safe to support 64-bit only. If there were 32-bit chips come into market, we could follow the way RISC-V did, split this into sljitNativeLOONGARCH_common.c and so-on.

Copy link
Owner

@zherczeg zherczeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -3684,8 +3745,8 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_atomic_store(struct sljit_compiler
if (unsign)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unsign support can be removed in the future.

@zherczeg zherczeg merged commit ee88612 into zherczeg:master Jul 5, 2024
@lrzlin lrzlin deleted the loong-llsc branch July 5, 2024 03:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants