Skip to content

Comments

Fix locs#111

Merged
ltratt merged 2 commits intoykjit:mainfrom
vext01:fix-locs
Apr 1, 2025
Merged

Fix locs#111
ltratt merged 2 commits intoykjit:mainfrom
vext01:fix-locs

Conversation

@vext01
Copy link
Contributor

@vext01 vext01 commented Apr 1, 2025

See commits.

Requires: ykjit/yk#1670

@vext01
Copy link
Contributor Author

vext01 commented Apr 1, 2025

Hold on, let me force push formatting pushes to the second commit. Forgot that # is a comment in a git commit.

src/lparser.c Outdated
* `PrintCode()` in `luac.c`. Note that we have to deduct one because luac
* prints bytecode pcs starting from 1.
*/
int loop_pc = INT_MAX;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like:

int loop_pc;
if ((GET_OPCODE(i) == OP_JMP) && (GETARG_sJ(i) < 0)) {
    loop_pc = GETARG_sJ(i) + pc + 2 - 1;
} else if (GET_OPCODE(i) == OP_FORLOOP) {
    loop_pc = pc - GETARG_Bx(i) + 2 - 1;
} else { return; }
f->yklocs[loop_pc] = yk_location_new();

to avoid the INT_MAX edge case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually why not just:

if ((GET_OPCODE(i) == OP_JMP) && (GETARG_sJ(i) < 0)) {
    f->kylocs[GETARG_sJ(i) + pc + 2 - 1] = yklocation_new();
} else if (GET_OPCODE(i) == OP_FORLOOP) {
    f->yklocs[pc - GETARG_Bx(i) + 2 - 1 = yk_location_new();
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine if we don't want the assert.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The INT_MAX aspect is probably incorrect on some platforms, so foregoing the assert seems OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully f0b504b is a decent compromise.

I considered doing 2 loops:

  • one to initialise every pc with a null location
  • a second to overwrite nulls with non-null locations where necessary

This would be more fool-proof, but less performant.

@ltratt
Copy link
Contributor

ltratt commented Apr 1, 2025

I am not too worried by the slowdowns: that was kind of inevitable. List has always been on the edge of timeout.

@vext01
Copy link
Contributor Author

vext01 commented Apr 1, 2025

I am not too worried by the slowdowns: that was kind of inevitable. List has always been on the edge of timeout.

I see.

FWIW, the benchmark does a lot of recursion. My guess is that we don't handle that well, but that's one for another day.

@ltratt
Copy link
Contributor

ltratt commented Apr 1, 2025

Agreed. Let's get this in.

@ltratt
Copy link
Contributor

ltratt commented Apr 1, 2025

Works for me! Please squash.

vext01 added 2 commits April 1, 2025 16:08
Example output:

```
$ ./src/luac -l while_loop.lua

main <while_loop.lua:0,0> (11 instructions at 0x62c51d0)
0+ params, 3 slots, 1 upvalue, 1 local, 1 constant, 0 functions
        1       [1]     VARARGPREP      0
        2       [1]     LOADI           0 300
ykloc:  3       [2]     EQI             0 0 1
        4       [2]     JMP             6       ; to 11
        5       [3]     ADDI            0 0 -1
        6       [3]     MMBINI          0 1 7 0 ; __sub
        7       [4]     GETTABUP        1 0 0   ; _ENV "print"
        8       [4]     MOVE            2 0
ykloc:  9       [4]     CALL            1 2 1   ; 1 in 0 out
        10      [4]     JMP             -8      ; to 3
        11      [5]     RETURN          1 1 1   ; 0 out
```

(note that the locations have been added in the wrong place, and this
change is to help me fix that)
The placement of YkLocations was based upon incomplete bytecode
instructions, leading to nonsense placement of locations.

The first commit adjusts `luac` to make it show location placement in
its output. The second commit fixes the placement.

Examples follow. Pay attendtion to the "to X" annotations that luac
prints on the ends of lines that may jump.

for loops
=========

```lua
for i = 0, 300 do
    print(i)
end
```

before
------

```
main <for_loop.lua:0,0> (10 instructions at 0x10b181d0)
0+ params, 6 slots, 1 upvalue, 4 locals, 1 constant, 0 functions
        1       [1]     VARARGPREP      0
        2       [1]     LOADI           0 0
        3       [1]     LOADI           1 300
        4       [1]     LOADI           2 1
        5       [1]     FORPREP         0 3     ; exit to 10
        6       [2]     GETTABUP        4 0 0   ; _ENV "print"
ykloc:  7       [2]     MOVE            5 3
        8       [2]     CALL            4 2 1   ; 1 in 0 out
        9       [1]     FORLOOP         0 4     ; to 6
        10      [3]     RETURN          0 1 1   ; 0 out
```

after
-----

```
main <for_loop.lua:0,0> (10 instructions at 0x1995d1d0)
0+ params, 6 slots, 1 upvalue, 4 locals, 1 constant, 0 functions
      	1	[1]	VARARGPREP	0
      	2	[1]	LOADI    	0 0
      	3	[1]	LOADI    	1 300
      	4	[1]	LOADI    	2 1
      	5	[1]	FORPREP  	0 3	; exit to 10
ykloc:	6	[2]	GETTABUP 	4 0 0	; _ENV "print"
      	7	[2]	MOVE     	5 3
      	8	[2]	CALL     	4 2 1	; 1 in 0 out
      	9	[1]	FORLOOP  	0 4	; to 6
      	10	[3]	RETURN   	0 1 1	; 0 out
```

while loops
===========

```lua
local i  = 300
while i ~= 0 do
    i = i - 1
    print(i)
end
```

before
------

```
main <while_loop.lua:0,0> (11 instructions at 0x3837c1d0)
0+ params, 3 slots, 1 upvalue, 1 local, 1 constant, 0 functions
        1       [1]     VARARGPREP      0
        2       [1]     LOADI           0 300
ykloc:  3       [2]     EQI             0 0 1
        4       [2]     JMP             6       ; to 11
        5       [3]     ADDI            0 0 -1
        6       [3]     MMBINI          0 1 7 0 ; __sub
        7       [4]     GETTABUP        1 0 0   ; _ENV "print"
        8       [4]     MOVE            2 0
ykloc:  9       [4]     CALL            1 2 1   ; 1 in 0 out
        10      [4]     JMP             -8      ; to 3
        11      [5]     RETURN          1 1 1   ; 0 out
```

after
-----

```
main <while_loop.lua:0,0> (11 instructions at 0x3bd111d0)
0+ params, 3 slots, 1 upvalue, 1 local, 1 constant, 0 functions
      	1	[1]	VARARGPREP	0
      	2	[1]	LOADI    	0 300
ykloc:	3	[2]	EQI      	0 0 1
      	4	[2]	JMP      	6	; to 11
      	5	[3]	ADDI     	0 0 -1
      	6	[3]	MMBINI   	0 1 7 0	; __sub
      	7	[4]	GETTABUP 	1 0 0	; _ENV "print"
      	8	[4]	MOVE     	2 0
      	9	[4]	CALL     	1 2 1	; 1 in 0 out
      	10	[4]	JMP      	-8	; to 3
      	11	[5]	RETURN   	1 1 1	; 0 out
```

nested loops
============

```lua
local i  = 300
while i ~= 0 do
    i = i - 1
    for i = 0, 10 do
        print(i)
    end
end
```

before
------

```
main <nested_loops.lua:0,0> (16 instructions at 0x206921a0)
0+ params, 7 slots, 1 upvalue, 5 locals, 1 constant, 0 functions
        1       [1]     VARARGPREP      0
        2       [1]     LOADI           0 300
ykloc:  3       [2]     EQI             0 0 1
        4       [2]     JMP             11      ; to 16
        5       [3]     ADDI            0 0 -1
        6       [3]     MMBINI          0 1 7 0 ; __sub
        7       [4]     LOADI           1 0
        8       [4]     LOADI           2 10
        9       [4]     LOADI           3 1
        10      [4]     FORPREP         1 3     ; exit to 15
        11      [5]     GETTABUP        5 0 0   ; _ENV "print"
ykloc:  12      [5]     MOVE            6 4
        13      [5]     CALL            5 2 1   ; 1 in 0 out
ykloc:  14      [4]     FORLOOP         1 4     ; to 11
        15      [6]     JMP             -13     ; to 3
        16      [7]     RETURN          1 1 1   ; 0 out
```

after
-----

```
main <nested_loops.lua:0,0> (16 instructions at 0x132f21a0)
0+ params, 7 slots, 1 upvalue, 5 locals, 1 constant, 0 functions
      	1	[1]	VARARGPREP	0
      	2	[1]	LOADI    	0 300
ykloc:	3	[2]	EQI      	0 0 1
      	4	[2]	JMP      	11	; to 16
      	5	[3]	ADDI     	0 0 -1
      	6	[3]	MMBINI   	0 1 7 0	; __sub
      	7	[4]	LOADI    	1 0
      	8	[4]	LOADI    	2 10
      	9	[4]	LOADI    	3 1
      	10	[4]	FORPREP  	1 3	; exit to 15
ykloc:	11	[5]	GETTABUP 	5 0 0	; _ENV "print"
      	12	[5]	MOVE     	6 4
      	13	[5]	CALL     	5 2 1	; 1 in 0 out
      	14	[4]	FORLOOP  	1 4	; to 11
      	15	[6]	JMP      	-13	; to 3
      	16	[7]	RETURN   	1 1 1	; 0 out
```

Most benchmarks speed up, in many cases by a lot. For example:

 - Richards is about 4x faster than before.
 - Bounce, CD, Json and Permute are about 2x faster than before.

Outliers that slow down as a result of this change are:
 - NBody is about 85% slower.
 - List times out when run under rebench.

Raw data follows:

```
 BigLoop     Lua          yk     1000000000         6  6628
 BigLoop     YkLuaAfter   yk     1000000000         6  2566
 BigLoop     YkLuaBefore  yk     1000000000         6  2567
             YkLuaAfter/YkLuaBefore                     1.0

 Bounce      Lua          awfy         1500         6  1397
 Bounce      YkLuaAfter   awfy         1500         6  2607
 Bounce      YkLuaBefore  awfy         1500         6  4663
             YkLuaAfter/YkLuaBefore                    0.56

 CD          Lua          awfy          250         6  3391
 CD          YkLuaAfter   awfy          250         6  18852
 CD          YkLuaBefore  awfy          250         6  41803
             YkLuaAfter/YkLuaBefore                    0.45

 DeltaBlue   Lua          awfy        12000         6  935
 DeltaBlue   YkLuaAfter   awfy        12000         6  5379
 DeltaBlue   YkLuaBefore  awfy        12000         6  6906
             YkLuaAfter/YkLuaBefore                    0.78

 Havlak      Lua          awfy         1500         6  7693
 Havlak      YkLuaAfter   awfy         1500         6  26934
 Havlak      YkLuaBefore  awfy         1500         6  32157
             YkLuaAfter/YkLuaBefore                    0.83

 Json        Lua          awfy          100         6  1379
 Json        YkLuaAfter   awfy          100         6  7049
 Json        YkLuaBefore  awfy          100         6  12444
             YkLuaAfter/YkLuaBefore                    0.56

 List        Lua          awfy         1500         6  1090
 List        YkLuaAfter   awfy         1500         0  Failed
 List        YkLuaBefore  awfy         1500         6  10970
             YkLuaAfter/YkLuaBefore                    timeout

 Mandelbrot  Lua          awfy          500         6  446
 Mandelbrot  YkLuaAfter   awfy          500         6  242
 Mandelbrot  YkLuaBefore  awfy          500         6  312
             YkLuaAfter/YkLuaBefore                    0.77

 NBody       Lua          awfy       250000         6  1045
 NBody       YkLuaAfter   awfy       250000         6  2203
 NBody       YkLuaBefore  awfy       250000         6  1185
             YkLuaAfter/YkLuaBefore                    1.85

 Permute     Lua          awfy         1000         6  1302
 Permute     YkLuaAfter   awfy         1000         6  1630
 Permute     YkLuaBefore  awfy         1000         6  2947
             YkLuaAfter/YkLuaBefore                    0.55

 Queens      Lua          awfy         1000         6  925
 Queens      YkLuaAfter   awfy         1000         6  1466
 Queens      YkLuaBefore  awfy         1000         6  5858
             YkLuaAfter/YkLuaBefore                    0.25

 Richards    Lua          awfy          100         6  5276
 Richards    YkLuaAfter   awfy          100         6  11747
 Richards    YkLuaBefore  awfy          100         6  49269
             YkLuaAfter/YkLuaBefore                    0.24

 Sieve       Lua          awfy         3000         6  1056
 Sieve       YkLuaAfter   awfy         3000         6  3959
 Sieve       YkLuaBefore  awfy         3000         6  6108
             YkLuaAfter/YkLuaBefore                    0.64

 Storage     Lua          awfy         1000         6  2462
 Storage     YkLuaAfter   awfy         1000         6  4341
 Storage     YkLuaBefore  awfy         1000         6  4681
             YkLuaAfter/YkLuaBefore                    0.93

 Towers      Lua          awfy          600         6  1487
 Towers      YkLuaAfter   awfy          600         6  15125
 Towers      YkLuaBefore  awfy          600         6  15133
             YkLuaAfter/YkLuaBefore                    1.0
```
@vext01
Copy link
Contributor Author

vext01 commented Apr 1, 2025

Squashed, but note this needs ykjit/yk#1670 first.

@vext01 vext01 marked this pull request as ready for review April 1, 2025 15:09
@ltratt ltratt added this pull request to the merge queue Apr 1, 2025
Merged via the queue into ykjit:main with commit b26b1ba Apr 1, 2025
2 checks passed
@vext01 vext01 deleted the fix-locs branch April 1, 2025 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants