-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathscratchwork.txt
2571 lines (1948 loc) · 103 KB
/
scratchwork.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
Goal: Small scripting language that can be embedded anywhere, though not necessarily with the compiler attached.
Features:
Lisp syntax
No dependency on any library.
C/C++
Drawbacks:
Lisp syntax
Redundant code due to lack of libraries.
Types
constant
name
function
Composite Types
enum
array
struct
union
Objects
int
bool
enum
array
string
struct
union
function
There will be a parent table called _. This table is the parent of all global variables.
There is no type difference between a string and an identifier. A string constant can contain any character. An identifier must start with a letter or identifier_symbol and the rest of the characters must be letters, identifier_symbols or numbers.
If an identifier is passed to a expression or subexpression, the value will be used in the operation if possible. If the identifier cannot be found in the variable list, then the identifier will be converted into a string literal.
Every expression is a table. When executing a table, the first value must be a function.
("a" "b" "c") is {"a", "b", "c"}.
((a ('a' 'b' 'c')) 'b' 'c')
a/b/c
a/b:
When calling a function, there are two tables. The first is the table that passed to the function. The second is the table that defines how the function is called.
[function f [a b c] [
[return [+ a b c]]
]]
[f 1 2 3]
The calling list is f [1 2 3]
The definition list is f [a b c]
This is the same call expressed in a different way: [f f.b:2 f.c:3 f.a:1], or [f .b:2 .c:3 .a:1] (shorthand) This can be represented in the AST, but it will not necessarily execute sucessfully if the members are defined outside the function call.
evaluateSubexpression -> Returns a bool, an int, a float, a string, or an identifier.
identifierToValue
stringToIdentifier
"ABI"
before
stack base
args length (always > 0) (DuckLisp FP points here)
function name (string)
arg1
arg2
arg3
arg4
...
loc0
loc1
loc2
loc3
...
after
stack base
args length (always > 0) (DuckLisp FP points here)
function name (string)
arg1
arg2
arg3
arg4
...
loc0
loc1
loc2
loc3
...
ret0
ret1
ret2
ret3
...
rets length
Data tree
Access by name is fast.
Hashes of a given string must always return the same value no matter the scope.
An object's parent object is its namespace.
An object may have children.
Can have two methods of execution
Tree walk
Virtual machine
Function types
Bytecode generation
Appends or inserts bytecode into the block.
C callback
Inserts C function call into block in bytecode wrapper.
This class of functions is called by a bytecode generation function.
Heterogeneous
Calls bytecode and callback functions.
nop
add8
add16
add32
add64
sub8
sub16
sub32
sub64
push
pop
return
#call bytecode
#call *bytecode
#call C
#call *C
#add *float *float
#add *float float
#add *int *int
#add *int int
#sub *float *float
#sub *float float
#sub *int *int
#sub *int int
It turns out that this was not a bug in the allocator. Hurrah!
(dl_memoryBlock_t) { /* 90 */
.block = 4A690AF, /* offset = 111 */
.block_size = 2,
.allocated = true,
.unlinked = false,
.previousBlock = 44,
.nextBlock = 94,
},
(dl_memoryBlock_t) { /* 44 */
.block = 4A690AE, /* offset = 110 */
.block_size = 1,
.allocated = true,
.unlinked = false,
.previousBlock = 42,
.nextBlock = 90,
},
I'm going to split the compiler and VM to a certain extent. DuckLisp will always have the VM, but it won't always have the compiler. This will allow me to use as little memory as possible on small devices. For example, I doubt MicroComp will be able to compile DuckLisp for a while, even after a C compiler is ported to it.
Constant propigation
All the function has to do is be marked as pure. If it is pure and all arguments are constants, then the function may be pre-calculated in that instance.
Tail call
Is a recursive call the last function called? Replace the arguments and jump to the beginning.
Bookmark:
Figure out how bytecode chunks are structured in relation to each other before they are merged.
Function definitions can all be placed at the end of the program.
Function calls will only be generated after the function has been defined.
Make `compile` generate function call bytecode and call generators.
I have encountered a need for temporary variables. The question is, how do I allocate and free them? Once I've solved that, is there an easy way to do it more efficiently?
The problem with recursion is lack of context. With a stack, I can just pop the last value to see the context. With a function call, I can't see anything above the current function.
--- Two months later ---
^^^ Helpful, but I'm still confused.
Stack is for placing objects on during program execution, right?
Scope stack is for placing symbols on during program compilation.
These symbols are variables and generators.
My code does not entirely reflect the above.
I seem to have VM functions that manipulate the stack. If I am to completely seperate the VM from the compiler, then I need to move them to the VM module. C callbacks should not be passed to the compiler. The compiler should return a mapping of names to callback slots, and the callbacks should be placed in those slots during VM initalization. These slots will likely be stored on the stack.
Generators will generate high-level bytecode. This bytecode is inserted into a linked list that doubles as an array. Branch targets will be stored as indexes to the array, which will allow instructions to be inserted between the branch and the target with no side effects.
Functions are compiled to bytecode, and placed after the `defun` instruction. `defun` then pushes the address of the function on the stack. `call` will take the stack address and extract the address of the function. The function will then be called.
C functions are not stored in bytecode. They are stored on a separate stack dedicated to C callbacks. The `ccall` instruction will take the callback stack address and then run the function in that position. Forward references to a label are dealt with by inserting a pseudo-instruction to act as the target. When the bytecode emitter comes across this pseudo-instruction, it immediately replaces it with the first of its own instructions.
To ease the load on my brain, I think I will add an intermediate representation. This is solely for the purpose of dealing with branch targets.
Compiler sees expression. Compiler looks up function name in the compiled function list. If it is there, a function call is generated. Compiler looks up function name in C function list. If it is there, a C function call is generated. Compiler looks up name in generator function list. If it is there, the compiler passes the expression to the generator. The generator emits bytecode. If a generator does not exist, a compile error is thrown.
Recursive descent could work for optimization. A generator can not see anything above it, but it can see below it. The generator can traverse down the tree as far as far as it desires, and if it sees an optimization, it can rearrange the tree. The generator will then return and the compiler will traverse the tree by one node. It will then call the generator for that function. This will continue until the whole tree has been traversed.
Each high-level bytecode instruction will contain an opcode class and arguments. An opcode is an enum. Arguments are unions that can have the types integer, float, string, or label. Labels are `ptrdiff_t`s that point to an element of the bytecode array.
High-level bytecode is assembled into raw bytecode. The opcode is determined from the opcode class and argument sizes, and branch targets are calculated from the final instruction lengths.
Perhaps I should create generators that emit a single instruction so that it is possible to write VM assembly directly in the program? I could also add an assembler that accepts text assembly files. I doubt they would be too hard to parse.
If I go with all of the above, we will have these modules:
DuckLib.so
duckLisp.so
duckVM.so
duckAsm.so
We traverse the tree top-down because we want to allow the generators to optimize the the tree if they wish.
There are three stacks. "The Stack" is the runtime VM stack. The generator stack contains generators. The function stack may contain C functions.
All functions are anonymous in the VM.
Tree → list strategy:
Generator:
Check arguments.
Reorder arguments in tree.
Create list of instruction list fragments. These lists will be joined with the compiled arguments, which may themselves have similar trees of instruction list fragments. It looks sort of like this:
ast = [a, b, c]
generator:
instructionList = [i0, i1, i2]
order = [2, 1, 3]
reorder(ast, order)
for i in range(len(ast)):
newInstructionList.append(instructionList[i])
newInstructionList.append(ast[i])
Expand tree into instruction list. This is easy, since we just traverse the tree and append each leaf to the end of the list.
bytecode file format:
((ascii8[2] DL) (uint16 <callbacks length>) (uint32 <bytecode length>) (uint8[<bytecode length>] <bytecode>))
[[i0 i1 i2] [i3 i4 i5] [i6 i7 i8] [
[[i9 i10 i11] [i12 i13 i14] [i15 i16 i17] [
[[i18 i19 i20] [i21 i22 i23]]
]]
[[i24 i25 i26] [i27 i28]]
]]
node = [[instruction*]* [node*]]
instructions = [instruction*]
instructionsList = [instructions*]
nodes = [node*]
node = [instructionsList nodes]
node = dl_array_t:dl_array_t
nodes = dl_array_t:dl_array_t
instructionsList = dl_array_t:dl_array_t
instructions = dl_array_t:duckLisp_instructionObject_t
instruction = duckLisp_instructionObject_t
Almost done with compilation. I have decided on the strategy of giving each generator its own piece of the instruction list. The problem is that arguments must be evaluated left-to-right to allow the program to execute top-down.
(7 (3 (1) (2)) (6 (4) (5)))
(
(defun copy-function ((f pointer:function) (size size_t))
(var g pointer:function)
(setq g (malloc size))
(copy-bytes (cast g pointer:byte) (cast f pointer:byte) (* size size-of(byte)))
(return g))
(defun f () 0)
(var p-g pointer:function)
(copy-function p-g (addr-of f))
(defmacro g ()
#(call p-g))
(g))
(
(defun # #
(var # #)
(setq g (malloc size))
(copy-bytes (cast g #) (cast f #) (* size (size-of byte)))
(return g))
(defun f # #)
(var # #)
(copy-function p-g (addr-of f))
(defmacro # #
'(call p-g))
(g))
(
(defun
(var)
(setq g (malloc size))
(copy-bytes (cast g) (cast f) (* size (size-of byte)))
(return g))
(defun f)
(var)
(copy-function p-g (addr-of f))
(defmacro
'(call p-g))
(g))
(
(defun
(var)
(setq g (malloc size))
(copy-bytes (cast g) (cast f) (* size (size-of byte)))
(return g))
(defun f)
(var)
(copy-function p-g (addr-of f))
(defmacro
'(call p-g))
(g))
(() ()): Left-right -
(f () ()): Undefined - Whatever
(f (g ())): Outside-in - Tree, top-down, right-left
Outer expressions compile before inner.
Inner expressions' assembly come before outer's.
Left expressions' assembly come before right's.
Outer expressions compile before inner.
Inner expressions' assembly comes after outer's.
Left expressions' assembly comes before right's.
(7 (3 (1) (2)) (6 (4) (5)))
(1 (5 (7) (6)) (2 (4) (3)))
(1 (2 (3) (4)) (5 (6) (7)))
(1 (5 (6) (7)) (2 (3) (4)))
(1 (5 (7) (6)) (2 (4) (3)))
(5(2 (*1) (*1)) (4 (*3) (*3)))
(7 (3 (1) (2)) (6 (4) (5)))
(7 (3 (1) (2)) (6 (4) (5)))
(7 (3 (1) (2)) (6 (4) (5)))
7 +
3 +
1 +
2 -
6 -
4 +
5 -
Required objects:
Return stack
Index? May be built into the array.
Current node
+ Create array. Push array in current node. Push array on stack. Set array as current node. Push instruction list in current node.
0 Push instruction list in current node.
- Push instruction list in current node. Pop array from stack. Set current node to popped array?
+ Create array. Push array in current node. Push array on stack. Set array as current node. Push instruction list in current node.
0 Push instruction list in current node.
- Push instruction list in current node. Pop array from stack. Set current node to popped array?
I will need a local struct for this.
struct node_s {
union {
dl_array_t *node;
dl_array_t *instructions;
}
dl_bool_t isNode;
} node_t;
rrealloc: Instead of adding memory to the end of the block, it adds it to the beginning.
Move pusher to top of loop. Done.
Generate tree in pusher.
node
instructions
node
node
node
First element in node is always an instruction array.
Store node addresses on stack, not nodes.
We have a dedicated node struct now.
All nodes will be kept in a dl_array.
Nodes will keep an array of nodes by storing the index in the master array.
Indices will be pushed on the stack.
Needed universal identifier for each node → index to a single array.
Needed each nodes to contain a list of nodes → indeces to a subset of the array elements.
Desired a way to keep track of all nodes in use → single array containing nodes.
Desired a way to use dl_array instead of raw memory allocation.
Whew! That's done.
The VM is a modified stack machine. It might be more helpful to think of it as a Harvard architecture machine with a data memory that can grow infinitely.
Each function call creates its own environment. This is mainly a feature of the compiler. When a function is called, local variables are pushed onto the stack. At the end of the function they are popped. DL functions are stored in program memory. Functions are DL objects that contain either a pointer to a DL function or a pointer to a C callback. Since all objects are stored on the stack, all functions are local variables that can be pushed and popped. Variables are referenced by index relative to the current frame pointer. The reason for this is that the stack pointer changes too often to easily keep track of indexes using it, and absolute addressing takes up a lot of space. In most cases, the frame pointer will remain constant for the entire duration of a function. If another function is called, the callee will need to use an offset relative to the frame pointer of the scope that contains the function.
stack — variables
variable — generic data
"ABI"
before — Push, call, push
stack base
args length (always > 0) (DuckLisp FP points here)
function name (string)
arg1
arg2
arg3
arg4
...
loc0
loc1
loc2
loc3
...
after — Push
stack base
args length (always > 0) (DuckLisp FP points here)
function name (string)
arg1
arg2
arg3
arg4
...
loc0
loc1
loc2
loc3
...
ret0
ret1
ret2
ret3
...
rets length
cleanup — Return, copy
ret0 ← Rets are now locs owned by the caller
ret1 ← Rets are now locs owned by the caller
ret2 ← Rets are now locs owned by the caller
ret3 ← Rets are now locs owned by the caller
arg3 ← Stack top
arg4
...
loc0
loc1
loc2
loc3
...
ret0
ret1
ret2
ret3
...
rets length
Calls to DL from C are done by pushing objects on the stack and then calling the function. Return values are placed on the top of the stack for C to pop.
The stack is persistent between calls.
Bytecode is not persistent between calls. Bytecode can be run straight by the VM, or a function that is currently on the stack can be called. A bytecode function pointed to by the stack will point to a random block of memory. C will be given no indication of when it is freed.
This is a one-shot run of some bytecode. This may define functions that can be used by other chunks of bytecode. Dangling pointers may result if `bytecode` is freed.
e = duckVM_execute(&duckVM, bytecode, bytecode_length);
This is a function call. DVM knows nothing about function names, so an index must be given instead.
e = duckVM_pushObject(&duckVM, duckLispObject);
e = duckVM_call(&duckVM, function_index); // Index is an absolute address on the stack.
e = duckVM_popObject(&duckVM, &duckLispObject);
Solved:
C → DL calling conventions.
DL → C/DL calling conventions?
Local variables.
Unsolved:
Non-local scope addressing. Real CPUs deal with this by storing functions in memory, not on the stack.
A label may only have one destination address.
A label may have many source addresses.
A label will have its index placed in a trie right after it is assembled to bytecode.
A backward goto will calculate its jump distance when it is converted to bytecode.
A forward goto will use 32 bits and enter its address into a trie (with label as key) that points to an array with pointers to all the gotos to that label.
(goto <label>) — Create label and goto array if not already created. Emit high-level jump instruction.
(label <label>) — Create label if not already created. Emit label pseudo instruction.
label trie points to an array of links?
Jump — Insert current address in car of link array.
Label — Insert current address in cdr of link array.
Goto trie will contain pointers to arrays of source addresses.
Label trie will contain destination addresses.
After bytecode list is generated and goto and label tries are populated, create jump link array.
I did the label scoping wrong. Scoping should always be done in the generators, but I did it in bytecode generation. What I am currently doing is giving the label name (a string) as the instruction argument. What I should be doing is looking up the label index in the label trie (which may have been constructed in the same generator). The index, not the name, should be passed as the instruction argument. During code generation, the index is used to populate the element of the label array that contains the jump structure.
I dealt with the label scoping problem. It works better. The problem I'm having now is again forward references. Apparently forward references are a problem for scoping in addition to bytecode generation. If a goto or label (they are the essentially the same) is placed inside a scope and another goto or label has been placed after the scope, two labels will be created, one inside and one outside. If a goto or label preceeds the scope, only one label is created.
Rules:
A label returns no value. It is a pseudo-instruction.
A label may only appear in the top-level array of a closed scope expression.
Allowed:
(
(label "cleanup"))
Disallowed:
(+ a b (label "cleanup"))
(push-scope (label "cleanup"))
This rule is partially an extension of the first rule. This allows the () expression to create the label before its children search for it.
Escape sequences are now expanded in CST → AST.
Discord Hanabi chat:
<discord user=an_origamian>
I've been working on branching.
Scoping for it is done and wasn't too hard, though the current implementation only allows labels in the top level of an expression with an expression as its function part.
```lisp
;; Labels allowed here.
(
(label l)
(nop)
(goto l))
;; Labels disallowed here.
(+ (label l) 5 (goto l))
;; Labels disallowed here.
(+ (push-scope)
(label l)
(nop)
(goto l)
(pop-scope))
;; Labels allowed here.
(+ (
(label l)
(nop)
(goto l))
5
6)
```
.
Labels are pseudo-instructions that have no bytecode representation.
Gotos are jump instructions.
At the moment, labels and gotos return no value.
```lisp
;; Jump8 uses a one-byte relative address.
(jump8 offset) ; Two bytes long.
;; Jump16 uses a two-byte relative address.
(jump16 offset-LSB offset-MSB) ; Three bytes long.
;; Jump32 uses a four-byte relative address.
(jump32 offset-LSB offset-ML offset-MH offset-MSB) ; Five bytes long.
```
When a jump instruction executes, the instruction pointer (IP) points first to the opcode and then to each of the bytes in the operand. After the relative address has been extracted from the instruction, the IP points to the instruction after the jump. The relative address is then added to this value of the IP, and the VM continues executing code from the new address.
.
The compiler portion of the `compile` function generates high-level assembly. Instead of emitting a jump8 instruction, it would simply emit the jump high-level instruction. This then gets converted to one of the above three variants of jump by the assembler portion of `compile`. The question is, how does the assembler choose which instruction to emit? It seems simple at first. Just subtract the source of the jump from the target label. If all jump instructions are the five-byte form, this works just fine, but if the jump instructions vary in length, then it becomes more difficult to calculate the real target address. Take this for example:
```lisp
(
(label l)
<130 nops>
(goto l))
```
This is compiled to (using a representation of the internal high-level assembly)
```lisp
(
(label 0)
<130 nops>
(jump 0))
```
`0` is the index of the label in the label array. It does not correspond to the actual address in any way.
And now down to bytecode:
```lisp
0 <130 nops>
130 <jump8, 16, or 32?>
```
To determine the size of the instruction, we need to know the jump distance. This is easy. Distance ≈ 0 - 130 = -130. |-130| > 127, so we need to use the 16-bit version. The distance is measured from the last byte of the jump instruction, so we also have to add that to the naive total. -130 - 3 = -133.
```lisp
0 <130 nops>
130 jump16 16'd-133
```
`16'd<number>` is Verilog's notation for 16-bit decimal numbers. It's quite convenient for this sort of thing.
Easy! It's just a bit of arithmetic.
Here's where it gets complicated.
```lisp
(
(label l)
(label m)
<124 nops>
(goto l)
(goto m))
```
↓
```lisp
(
(label 0)
(label 1)
<124 nops>
jump 0
jump 1)
```
↓
```lisp
0 <124 nops>
124 jump?? ??
124+? jump?? ??
```
Now how do we calculate this? It's not that much more difficult to figure out, but if we focus on the second jump it illustrates the point. If we assume the first jump is jump8, then the whole instruction will be two bytes long. If we assume the second jump is jump8, it will also be two bytes. Label `m` will be 0-(124+2+2) = -128 bytes forward. Since the size of the address fits in eight bits using two's compliment negative numbers, jump8 is sufficient to reach the target.
```lisp
0 <124 nops>
124 jump8 8'd-126
126 jump8 8'd-128
```
And it works out! This only happened because of the assumption that each offset fit in a single byte. If we use jump16, it no longer fits.
```lisp
0 <124 nops>
124 jump16 16'd-127
127 jump16 16'd-130
```
It would be possible for the assembler to optimize this to use jump8s, but assuming the smallest size instruction and working up to largest is much easier to program.
.
Needless to say, most programs are going to have a ton of branches, and determining the size will not necessarily be easy.
One approach is to iterate over all the branches in the bytecode and gradually expand each one until the offsets of each one fit into the instruction. This requires that extra bytes are inserted into the middle of the bytecode. To do this with an array, the bytecode memory block would have to be reallocated and then a portion of the bytecode would have to be copied to make room for the inserted instruction. This is inefficient. Another approach is to make the bytecode an array containing links in a list. This has the advantage that links can be inserted into the middle of the list and each link can still be accessed by its index in the array. New list links are pushed onto the end of the array. The problem is that since incrementing the array index does not necessarily mean incrementing the list index, distance can no longer be determined by subtracting the source index from the target index. A third approach is to create a system of equations describing the relationships between each branch. Once the equation is solved and addresses are mathematically calculated, distance is no longer required, so all that needs to be done is rewrite the jump opcodes and insert links after them with the correct address. I don't think creating the equation will be hard, but solving it will be another matter since it will either have if statements or logs.
I've read that optimizing the code size of addressing is not something that programmers worried about even in the 70s, but I still think it would be nice to have.
.
`asize(a)` = `ceiling(log128(a)) + 1`
```c
asize(0) = 0
asize(1) = 1
asize(-1) = 1
asize(127) = 1
asize(128) = 2
asize(-128) = 1
asize(-129) = 2
```
`index` = Index of byte in bytecode array before addresses are inserted.
Source:
```lisp
(
(label l)
(label m)
<124 nops>
(goto l)
(goto m))
```
Assembled bytecode:
```txt
<124 nop opcodes>
; No operand yet. Just opcode.
jump8
jump8
```
Annotations:
```txt
l0
l1
<124 nops>
b0 jump l0 a0
b1 jump l1 a1
```
Left field is an address.
Middle field is an opcode.
Middle-left field is a label so you know where it goes.
Right field is an address associated with the opcode. It is not actually in the bytecode array.
This is the system of equations that result.
```c
l0 = index
l1 = index
b0 = index
a0 = l0 - (b0 + asize(a0))
b1 = b0 + 1 + asize(a0)
a1 = l1 - (b1 + asize(a1))
```
This expands to
```c
l0 = 0
l1 = 0
b0 = 124
a0 = l0 - (b0 + asize(a0))
b1 = b0 + 1 + asize(a0)
a1 = l1 - (b1 + asize(a1))
```
```c
a0 = 0 - (124 + asize(a0))
b1 = b0 + 1 + asize(a0)
a1 = 0 - (b1 + asize(a1))
```
```c
a0 = -ceiling(log128(a0)) - 125
b1 = ceiling(log128(a0)) + 2
a1 = -ceiling(log128(a1)) - b1 - 1
```
Wheeeee!!!! Logs and ceilings, and maybe even if statements! I have no idea how to solve this without a trial and error approach. Trial and error may be an acceptable solution.
</discord>
1. Start with best estimate using 2-byte jump instructions.
2. Clear the "not done" flag. Set the cumulative offset to zero. For each link… (in order)
a. Add cumulative offset.
b. Calculate required instruction size.
c. Set "not done" flag if current instruction size is too small.
d. Increase instruction size to required instruction size if necessary.
3. If "not done" flag is set, goto 2. Else, goto 4.
4. Reallocate the bytecode to account for the new size of the branches.
5. Write the branch instructions and relocate the incumbent bytecode.
Sort:
One array is sorted.
One array is unsorted.
The destination array is double the size of the other two.
Array is filled and sorted.
I'm writing generators now. They are a pain, simply because dynamic typing requires detecting many types.
Solutions:
1. Replace the type tree with a single type struct. Will require lots of work, but will reduce the code complexity throughout duck-lisp.
2. Make a list of DL API functions. Refer the chart when writing generators.
Did #1.
Writing generators should be a bit easier now.
The iterative tree traversal is a pain. Let's try making it recursive.
The compiler will change.
The assembler may change.
The generators will change.
The parser will not change.
The disassembler will not change.
The emitters should not change.
Steps:
source → reader ⇒ AST
(AST → generator ⇒ AST) & (AST → emitter ⇒ assembly)
assembly → assembler ⇒ bytecode
bytecode → disassembler ⇒ disassembly
Code structure:
loader
reader
compiler
generator
generator
emitter
assembler
optimizer
disassembler
Fibonacci: {
push-integer.8 00 ; (var a 0)
push-integer.8 01 ; (var b 1)
push-integer.8 00 ; (var c 0)
; (label loop)
push-index.8 00 ; (print a)
c-call.8 00
pop.8 01
push-string.8 01 "\n" ; (print "\n")
c-call.8 00
pop.8 01
add.8 00 01 ; (+ a b)
move.8 03 02 ; (setq c %)
pop.8 01
move.8 00 01 ; (setq b a)
move.8 02 00 ; (setq a c)
push-index.8 00 ; (print a)
c-call.8 00
pop.8 01
push-string.8 01 "\n" ; (print "\n")
c-call.8 00
pop.8 01
push-integer.16 E803 ; (print 1000)
c-call.8 00
pop.8 01
push-string.8 01 "\n" ; (print "\n")
c-call.8 00
pop.8 01
push-integer.16 E803 ; (< 1000 a)
less.8 03 00
c-call.8 00 ; (print %)
brnz.8 C0 ; (brnz % loop)
pop.8 02
nop ; (nop)
jump.8 BB ; (goto loop)
return
}
I ran into a problem I didn't expect. In order to calculate the condition for a branch, I need to push objects on the stack. In order to keep the stack balanced, I need to pop those items off the stack. So I guess I will do something similar to what I did to `progn` and pop all arguments required in the calculation before the branch. The number of arguments to pop will be given to `br??` as an additional integer argument. This argument will probably have to come after the jump offset in order to remain compatible with the current jump size optimization.
What the final version will look like:
push-integer.16 E803 ; (< 1000 a)
less.8 03 00
c-call.8 00 ; (print %)
+brnz.8 C0 02 ; (brnz % loop)
-brnz.8 C0 ; (brnz % loop)
-pop.8 02
-
-nop ; (nop)
-
-jump.8 BB ; (goto loop)
The top scope seems to always be unused. This should not be deleted without investigation because it may be possible for the top-level expression to declare a local.
Woohoo! Branching works!
label 0 multiply-loop
goto 1 multiply-end
label 2 add-loop
goto 3 add-end
label 3 add-end
goto 2 add-loop
goto 0 multiply-loop
label 1 multiply-end
label multiply-loop
goto multiply-end
label add-loop
goto add-end
goto add-loop
label add-end
goto multiply-loop
label multiply-end
Problem fixed by making labels greater than gotos when a tie occurs when sorting.
Macros:
Create VM.
Push arguments as locals.
Run macro.
Retrieve and paste result.
Destroy VM.
Functions:
Save PC on call stack.
Push arguments as locals.
Jump to function.
Run function.
Set PC to top of call stack, pop all function arguments and locals, and push the result.
Functions will be placed in the bytecode in relation to where they were defined in the source. The first instruction of a function will be a jump to the end of the function. This is so that program flow can pass through a function without running it. It *is* a hack, but right now I'm probably too lazy to do anything more complicated like relocating the function body.
(defun 1+ (a)
(print a)
(print (+ a 1)))
(print (1+ 5))
(goto --g###1)
label 1+
push-integer 1
(print -1)
add -2 -1
(print -1)
return 2
(label --g###1)
push-integer 5
call 1+
ccall print
All stack addressing is relative to the frame pointer.
Bookmark:
Add implicit progn to function body.
Return instruction
pop
reset frame pointer
return
call instruction
save frame pointer
jump
Bookmark:
Use frame pointer.
Find error messages that don't actually throw an error and force them to.
FUNCTIONS WORK!!!
REPL time!
I need to free my memory. Here's the list of all resources:
main::duckLispMemory freed
duckLisp_init::duckLisp->source Needs to be reallocated in duckLisp_loadString and freed in duckLisp_quit
duckLisp_init::duckLisp->errors Needs to be reallocated in duckLisp_loadString and freed in duckLisp_quit
duckLisp_init::duckLisp->scope_stack Needs to be reallocated in duckLisp_loadString and freed in duckLisp_quit
duckLisp_init::duckLisp->generators_stack Needs to be freed in duckLisp_quit
duckLisp_init::duckLisp->labels_stack Needs to be reallocated in duckLisp_loadString and freed in duckLisp_quit
I think cst_append is fine.
duckLisp_quit needs to clean up a bunch of stuff.
Labels will need to be able to return absolute program addresses.
(defun length (list)
(var i 0)
(while (not (null? list))
(setq list (cdr list))
(setq i (1+ i)))
i)
(defun nreverse (list)
(var reversed-list (list))
(while (not (null? list))
(setq reversed-list (cons (car list)
reversed-list))
(setq list (cdr list)))
reversed-list)
(defun append (list1 list2)
(var appended-list (list))
(while (not (null? list1))
(setq appended-list (cons (car list1)
appended-list))
(setq list1 (cdr list1)))
(while (not (null? list2))
(setq appended-list (cons (car list2)
appended-list))
(setq list2 (cdr list2)))
(nreverse appended-list))
;; Edit: I don't believe this will work. It can capture variables, but it will create a copy instead of pointing to it.
(defmacro lambda (caps args &rest body)
(var name (gensym))
`(no-scope
(defun ,name ,args
,body)
(list (get-label ,name)
,(append args caps))))
;; This is complicated for a simple function call…
(defun funcall (function arguments)
(var label (car function))
(var args (car (cdr function)))
(var args-length (length args))
(while (not (and (null? args) (null? arguments)))
(push (if (null? (car args))
(
(var result (car arguments))
(setq arguments (cdr arguments))
result)
(car args)))
(setq args (cdr args)))
(call label))
Required keywords/functions
lambda
defmacro
gensym — partially implemented
quote — Implemented
list — Implemented
cons — Implemented
get-label — Implemented
funcall
car — Implemented
cdr — Implemented
null? — Implemented
push — partially implemented
call — Implemented
(setq a-closure (list func a b c))
move.8 is throwing an error. This is resolved and was probably caused by bad balancing.
It would be *nice* to intern symbols, but I don't know how easy that would be. If I omit symbols, string literals could be a problem, but then again, I could just put them inside strings. It shouldn't be too hard to detect a string literal. The only problem this causes is when algorithmically creating string literals, but in this case, just wrap quotes around the string to distinguish it.
So here are the disadvantages of not having a dedicated symbol type:
Symbol comparison is slow.
The difference between symbols and strings is that symbols have a specific string format.
There is now a new symbol data type. It does not yet have a master package. Edit: Now it does.
get-label prerequisites:
Fetch absolute address of labels. I may have to do more fixups. On the other hand, these addresses are absolute, so that means four bytes for every one of them.
Fetch labels by name, ID, or pointer. Labels lose all of their identity when they are copied into the links array, and multiple target duplicates are created.
I need to keep track of
Absolute address of target