Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
8320379: C2: Sort spilling/unspilling sequence for better ld/st mergi…
…ng into ldp/stp on AArch64 Macro-assembler on aarch64 can merge adjacent loads or stores into ldp/stp[1]. For example, it can merge: ``` str w20, [sp, openjdk#16] str w10, [sp, openjdk#20] ``` into ``` stp w20, w10, [sp, openjdk#16] ``` But C2 may generate a sequence like: ``` str x21, [sp, openjdk#8] str w20, [sp, openjdk#16] str x19, [sp, openjdk#24] <--- str w10, [sp, openjdk#20] <--- Before sorting str x11, [sp, openjdk#40] str w13, [sp, openjdk#48] str x16, [sp, openjdk#56] ``` We can't do any merging for non-adjacent loads or stores. The patch is to sort the spilling or unspilling sequence in the order of offset during instruction scheduling and bundling phase. After that, we can get a new sequence: ``` str x21, [sp, openjdk#8] str w20, [sp, openjdk#16] str w10, [sp, openjdk#20] <--- str x19, [sp, openjdk#24] <--- After sorting str x11, [sp, openjdk#40] str w13, [sp, openjdk#48] str x16, [sp, openjdk#56] ``` Then macro-assembler can do ld/st merging: ``` str x21, [sp, openjdk#8] stp w20, w10, [sp, openjdk#16] <--- Merged str x19, [sp, openjdk#24] str x11, [sp, openjdk#40] str w13, [sp, openjdk#48] str x16, [sp, openjdk#56] ``` To justify the patch, we run `HelloWorld.java` ``` public class HelloWorld { public static void main(String [] args) { System.out.println("Hello World!"); } } ``` with `java -Xcomp -XX:-TieredCompilation HelloWorld`. Before the patch, macro-assembler can do ld/st merging for 3688 times. After the patch, the number of ld/st merging increases to 3871 times, by ~5 %. Tested tier1~3 on x86 and AArch64. [1] https://github.com/openjdk/jdk/blob/a95062b39a431b4937ab6e9e73de4d2b8ea1ac49/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L2079
- Loading branch information