From 392a9584b739d9513c42e2aa94c4a5e64dd27a09 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sat, 18 Oct 2025 09:53:16 +0300
Subject: [PATCH 01/18] Add text format description

---
 Linking.md | 191 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/Linking.md b/Linking.md
index 0bf4723..ab610f3 100644
--- a/Linking.md
+++ b/Linking.md
@@ -734,3 +734,194 @@ necessary for referencing such segments (e.g. in `data.drop` or `memory.init`
 instruction) do not yet exist.
 - There is currently no support for table element segments, either active or
 passive.
+
+# Text format
+
+## Relocations
+
+Relocations are represented as WebAssembly annotations of the form
+```wat
+(@reloc )
+```
+
+- `format` determines the resulting format of a relocation
+|``| corresponding relocation constants | interpretation |
+|----------|------------------------------------|---------------------|
+|`i32` | `R_WASM_*_I32` | 4-byte [uint32] |
+|`i64` | `R_WASM_*_I64` | 8-byte [uint64] |
+|`leb` | `R_WASM_*_LEB` | 5-byte [varuint32] |
+|`sleb` | `R_WASM_*_SLEB` | 5-byte [varint32] |
+|`leb64` | `R_WASM_*_LEB` | 10-byte [varuint64] |
+|`sleb64` | `R_WASM_*_SLEB` | 10-byte [varint64] |
+
+- `method` describes the type of relocation, so what kind of symbol we are relocating against and how to interpret that symbol.
+| `` | symbol kind | corresponding relocation constants | interpretation |
+|-------------|-------------|------------------------------------|-----------------------------------|
+| `tag` | event* | `R_WASM_EVENT_INDEX_*` | Final WebAssembly event index |
+| `table` | table* | `R_WASM_TABLE_NUMBER_*` | Final WebAssembly table index (index of a table, not into one) |
+| `global` | global* | `R_WASM_GLOBAL_INDEX_*` | Final WebAssembly global index |
+| `func` | function* | `R_WASM_FUNCTION_INDEX_*` | Final WebAssembly function index |
+| `functable` | function | `R_WASM_TABLE_INDEX_*` | Index into the dynamic function table, used for taking address of functions |
+| `text` | function | `R_WASM_FUNCTION_OFFSET` | Offset into the function body |
+| `funcsec` | function | `R_WASM_SECTION_OFFSET` | Offset into a function section |
+| `datasec` | data | `R_WASM_SECTION_OFFSET` | Offset into a data section |
+| `customsec` | N/A | `R_WASM_SECTION_OFFSET` | Offset into a custom section |
+| `data` | data | `R_WASM_MEMORY_ADDR_*` | WebAssembly linear memory address |
+Symbol kinds marked with `*` are considered *primary*.
+
+- `modifier` describes the additional attributes that a relocation might have.
+| `` | corresponding relocation constants | interpretation |
+|--------------|---------------------------------------|-------------------|
+| nothing | nothing | Normal relocation |
+| `pic` | `R_WASM_*_LOCREL_*`, `R_WASM_*_REL_*` | Address relative to `env.__memory_base` or `env.__table_base`, used for dynamic linking |
+| `tls` | `R_WASM_*_TLS*` | Address relative to `env.__tls_base`, used for thread-local storage |
+
+- `addend` describes the additional components of a relocation.
+| `` | interpretation | condition |
+|--------------|----------------------|-------------------------------------------------------------------|
+| nothing | Zero addend | always |
+| `+` | Positive byte offset | `method` allows addend |
+| `-` | Negative byte offset | `method` allows addend and `format` is signed |
+| `` | Byte offest to label | `method` allows addend and `method` is either `text` or `section` |
+
+- `symbol` describes the symbol against which to perform relocation.
+  - For `funcsec` relocation metod, this is the function id, so that if the addend is zero, the relocation points to the first instruction of that function.
+  - For `datasec` relocation metod, this is the data segment id, so that if the addend is zero, the relocation points to the first byte of data in that segment.
+  - For `customsec` relocation metod, this is the name of the custom section, so that if the addend is zero, the relocation points to the first byte of data in that segment.
+  - For other relocation metods, this denotes the symbol in the scope of that symbol kind.
+
+The relocation type is looked up from the combination of `format`, `method`, and `modifier`. If no relocation type exists, an error is raised.
+
+If a component of a relocation is predetermined, it must be skipped in the annotation text.
+If a component of a relocation is defaulted, it may be skipped in the annotation text.
+For example, a relocation into the function table by the index of `$foo` with a predetermined `format` would look like following:
+```wat
+(@reloc functable $foo)
+```
+If all components of a relocation annotation are skipped, the annotation may be omitted.
+
+### Instruction relocations
+
+For every usage of an `typeidx`, `funcidx`, `globalidx`, `tagidx`, a relocation annotation is added afterwards, with `format` predefined as `leb`, `method` predefined as the *primary* method for that type, and `symbol` defaulted as the *primary* symbol of that `idx`
+
+For the `i32.const` instruction, a relocation annotation is added after the integer literal operand, with `format` predefined as `sleb`, and `method` is allowed to be either `data` or `functable`.
+For the `i64.const` instruction, a relocation annotation is added after the integer literal operand, with `format` predefined as `sleb64`, and `method` is allowed to be either `data` or `functable`.
+For the `i{32,64}.{load,store}*` instructions, a relocation annotation is added after the offset operand, with `format` predefined as `leb` if the *memory* being referenced is 32-bit, and `leb64` otherwise, and `method` predefined as `data`.
+
+### Data relocations
+
+In data segments, relocation annotations can be interleaved into the data string sequence. When that happens, relocations are situated after the last byte of the value being relocated.
+For example, relocation of a 32-bit function pointer `$foo` into the data segment of size 4 would look like following:
+```wat
+(data (i32.const 0) "\00\00\00\00" (@reloc i32 functbl $foo))
+```
+
+## Symbols
+
+Symbols are represented as WebAssembly annotations of the form
+```wat
+(@sym *)
+```
+Data imports represented as WebAssembly annotations of the form
+```wat
+(@sym.import.data *)
+```
+
+- `name` is the symbol name written as WebAssembly `id`, it is the name by which relocation annotations reference the symbol. If it is not present, the symbol is considered *primary* symbol for that WebAssembly object, its name is taken from the related object
+  - There may only be one primary symbol for each WebAssembly object.
+  - If a symbol is not associated with an object, it may not be the primary symbol.
+
+- `qualifier` is one of the allowed qualifiers on a symbol declaration. Qualifiers may not repeat.
+
+| `` | effect |
+|------------------|----------------------------------------------------------------------------------|
+| `weak` | sets `WASM_SYM_BINDING_WEAK` symbol flag |
+| `static` | sets `WASM_SYM_BINDING_LOCAL` symbol flag |
+| `hidden` | sets `WASM_SYM_VISIBILITY_HIDDEN` symbol flag |
+| `retain` | sets `WASM_SYM_NO_STRIP` symbol flag |
+| `thread_local` | sets `WASM_SYM_TLS` symbol flag |
+| `size=` | sets symbol's `size` appropriately |
+| `offset=` | sets `WASM_SYM_ABSOLUTE` symbol flag, sets symbol's `offset` appropriately |
+| `name=` | sets `WASM_SYM_EXPLICIT_NAME` symbol flag, sets `name_len`, `name_data` |
+| `priority=` | adds symbol to `WASM_INIT_FUNCS` section with the given priority |
+| `comdat=` | adds symbol to a `comdat` with the given id |
+
+`priority` qualifier may only be applied to function symbols.
+`size` and `offset` qualifiers may only be applied to data symbols.
+`size` and `name` qualifiers must be applied to data symbols.
+`name` qualifier must be applied to data imports.
+
+If all components of a symbol annotation are skipped, the annotation may be omitted.
+
+### WebAssembly object symbols
+
+For symbols related to WebAssembly objects, the sequence of symbol annotation occurs after the optional `id` of the declaration.
+For example, the following code:
+```wat
+(import "env" "foo" (func (@sym $a retain name="a") (@sym $b hidden name="b") (param) (result)))
+```
+declares 3 symbols: one primary symbol with the name of the index of the function, one symbol with the name `$a`, and one symbol with the name `$b`.
+
+### Data symbols
+
+Data symbol annotations can be interleaved into the data string sequence. When that happens, relocations are situated before the first byte of the value being defined.
+For example, declaration of a 32-bit global by with the name `$foo` and linkage name "foo" would look like following:
+```wat
+(data (i32.const 0) (@sym $foo name="foo" size=4) "\00\00\00\00")
+```
+
+### Data imports
+
+Data imports occur in the same place as module fields. Data imports are always situated before data symbols.
+
+## COMDATs
+
+COMDATs are represented as WebAssembly annotations of the form
+```wat
+(@comdat )
+```
+where `id` is the WebAssembly name of the COMDAT, and `` is `name_len` and `name_str` of the `comdat`.
+
+COMDAT declarations occur in the same place as module fields.
+
+## Labels
+
+For some relocation types, an offset into a section/function is necessary. For these cases, lablels exsist.
+Labels are represented as WebAssembly annotations of the form
+```wat
+(@sym.label )
+```
+
+### Function labels
+Function labels occur in the same place as instructions. A label always denotes the first byte of the next instruction, or the byte after end of function's instruction stream if there isn't a next instruction
+Function label names are local to the function in which they occur.
+
+### Data labels
+Data labels can be interleaved into the data string sequence. When that happens, relocations are situated after the last byte of the value being relocated.
+Data label names are local to the data segment in which they occur.
+
+### Custom labels
+Custom labels can be interleaved into the data string sequence. When that happens, relocations are situated after the last byte of the value being relocated.
+Custom label names are local to the custom section in which they occur.
+
+## Data segment flags
+Data segment flags are represented as WebAssembly annotations of the form
+```wat
+(@sym.segment *)
+```
+
+- `qualifier` is one of the allowed qualifiers on a data segment declaration. Qualifiers may not repeat.
+| `` | effect |
+|-----------------|-----------------------------------------------|
+| `align=` | sets segment's `alignment` appropriately |
+| `name=` | sets `name_len`, `name_data` |
+| `strings` | sets `WASM_SEGMENT_FLAG_STRINGS` segment flag |
+| `thread_local` | sets `WASM_SEGMENT_FLAG_TLS` segment flag |
+| `retain` | sets `WASM_SEG_FLAG_RETAIN` segment flag |
+
+If `align` is not specified, it is given a default value of 1.
+If `name` is not specified, it is given an empty default value.
+
+If all components of segment flags are skipped, the annotation may be omitted.
+
+Data segment annotation occurs after the optional `id` of the data segment declaration.

From 51db201f010077905ceec87ba9b6f614337359d6 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sat, 18 Oct 2025 09:55:50 +0300
Subject: [PATCH 02/18] Fix GitHub SKILL ISSUE, take 1

---
 Linking.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Linking.md b/Linking.md
index ab610f3..7853c7e 100644
--- a/Linking.md
+++ b/Linking.md
@@ -745,6 +745,7 @@ Relocations are represented as WebAssembly annotations of the form
 ```
 
 - `format` determines the resulting format of a relocation
+
 |``| corresponding relocation constants | interpretation |
 |----------|------------------------------------|---------------------|
 |`i32` | `R_WASM_*_I32` | 4-byte [uint32] |
@@ -755,6 +756,7 @@ Relocations are represented as WebAssembly annotations of the form
 |`sleb64` | `R_WASM_*_SLEB` | 10-byte [varint64] |
 
 - `method` describes the type of relocation, so what kind of symbol we are relocating against and how to interpret that symbol.
+
 | `` | symbol kind | corresponding relocation constants | interpretation |
 |-------------|-------------|------------------------------------|-----------------------------------|
 | `tag` | event* | `R_WASM_EVENT_INDEX_*` | Final WebAssembly event index |
@@ -770,6 +772,7 @@ Relocations are represented as WebAssembly annotations of the form
 Symbol kinds marked with `*` are considered *primary*.
 
 - `modifier` describes the additional attributes that a relocation might have.
+
 | `` | corresponding relocation constants | interpretation |
 |--------------|---------------------------------------|-------------------|
 | nothing | nothing | Normal relocation |
@@ -777,6 +780,7 @@ Symbol kinds marked with `*` are considered *primary*.
 | `tls` | `R_WASM_*_TLS*` | Address relative to `env.__tls_base`, used for thread-local storage |
 
 - `addend` describes the additional components of a relocation.
+
 | `` | interpretation | condition |
 |--------------|----------------------|-------------------------------------------------------------------|
 | nothing | Zero addend | always |
@@ -911,6 +915,7 @@ Data segment flags are represented as WebAssembly annotations of the form
 ```
 
 - `qualifier` is one of the allowed qualifiers on a data segment declaration. Qualifiers may not repeat.
+
 | `` | effect |
 |-----------------|-----------------------------------------------|
 | `align=` | sets segment's `alignment` appropriately |

From e505cb0666922374c5b08e6940dbf9bc3bb44a4b Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sat, 18 Oct 2025 09:58:33 +0300
Subject: [PATCH 03/18] Fix GitHub SKILL ISSUE, take 2

---
 Linking.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Linking.md b/Linking.md
index 7853c7e..1aae781 100644
--- a/Linking.md
+++ b/Linking.md
@@ -769,6 +769,7 @@ Relocations are represented as WebAssembly annotations of the form
 | `datasec` | data | `R_WASM_SECTION_OFFSET` | Offset into a data section |
 | `customsec` | N/A | `R_WASM_SECTION_OFFSET` | Offset into a custom section |
 | `data` | data | `R_WASM_MEMORY_ADDR_*` | WebAssembly linear memory address |
+
 Symbol kinds marked with `*` are considered *primary*.
 
 - `modifier` describes the additional attributes that a relocation might have.

From e314caf9e34a04d7abd75ae38ede02316f121ff3 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sat, 18 Oct 2025 12:26:07 +0300
Subject: [PATCH 04/18] Fix typos

---
 Linking.md | 86 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 46 insertions(+), 40 deletions(-)

diff --git a/Linking.md b/Linking.md
index 1aae781..1597d86 100644
--- a/Linking.md
+++ b/Linking.md
@@ -752,8 +752,8 @@ Relocations are represented as WebAssembly annotations of the form
 |`i64` | `R_WASM_*_I64` | 8-byte [uint64] |
 |`leb` | `R_WASM_*_LEB` | 5-byte [varuint32] |
 |`sleb` | `R_WASM_*_SLEB` | 5-byte [varint32] |
-|`leb64` | `R_WASM_*_LEB` | 10-byte [varuint64] |
-|`sleb64` | `R_WASM_*_SLEB` | 10-byte [varint64] |
+|`leb64` | `R_WASM_*_LEB64` | 10-byte [varuint64] |
+|`sleb64` | `R_WASM_*_SLEB64` | 10-byte [varint64] |
 
 - `method` describes the type of relocation, so what kind of symbol we are relocating against and how to interpret that symbol.
 
@@ -790,10 +790,10 @@ Symbol kinds marked with `*` are considered *primary*.
 | `` | Byte offest to label | `method` allows addend and `method` is either `text` or `section` |
 
 - `symbol` describes the symbol against which to perform relocation.
-  - For `funcsec` relocation metod, this is the function id, so that if the addend is zero, the relocation points to the first instruction of that function.
-  - For `datasec` relocation metod, this is the data segment id, so that if the addend is zero, the relocation points to the first byte of data in that segment.
-  - For `customsec` relocation metod, this is the name of the custom section, so that if the addend is zero, the relocation points to the first byte of data in that segment.
-  - For other relocation metods, this denotes the symbol in the scope of that symbol kind.
+  - For `funcsec` relocation method, this is the function id, so that if the addend is zero, the relocation points to the first instruction of that function.
+  - For `datasec` relocation method, this is the data segment id, so that if the addend is zero, the relocation points to the first byte of data in that segment.
+  - For `customsec` relocation method, this is the name of the custom section, so that if the addend is zero, the relocation points to the first byte of data in that segment.
+  - For other relocation methods, this denotes the symbol in the scope of that symbol kind.
 
 The relocation type is looked up from the combination of `format`, `method`, and `modifier`. If no relocation type exists, an error is raised.
 
@@ -807,11 +807,11 @@ If all components of a relocation annotation are skipped, the annotation may be
 
 ### Instruction relocations
 
-For every usage of an `typeidx`, `funcidx`, `globalidx`, `tagidx`, a relocation annotation is added afterwards, with `format` predefined as `leb`, `method` predefined as the *primary* method for that type, and `symbol` defaulted as the *primary* symbol of that `idx`
+For every usage of `typeidx`, `funcidx`, `globalidx`, `tagidx`, a relocation annotation is added afterwards, with `format` predefined as `leb`, `method` predefined as the *primary* method for that type, and `symbol` defaulted as the *primary* symbol of that `idx`
 
-For the `i32.const` instruction, a relocation annotation is added after the integer literal operand, with `format` predefined as `sleb`, and `method` is allowed to be either `data` or `functable`.
-For the `i64.const` instruction, a relocation annotation is added after the integer literal operand, with `format` predefined as `sleb64`, and `method` is allowed to be either `data` or `functable`.
-For the `i{32,64}.{load,store}*` instructions, a relocation annotation is added after the offset operand, with `format` predefined as `leb` if the *memory* being referenced is 32-bit, and `leb64` otherwise, and `method` predefined as `data`.
+- For the `i32.const` instruction, a relocation annotation is added after the integer literal operand, with `format` predefined as `sleb`, and `method` is allowed to be either `data` or `functable`.
+- For the `i64.const` instruction, a relocation annotation is added after the integer literal operand, with `format` predefined as `sleb64`, and `method` is allowed to be either `data` or `functable`.
+- For the `i{32,64}.{load,store}*` instructions, a relocation annotation is added after the offset operand, with `format` predefined as `leb` if the *memory* being referenced is 32-bit, and `leb64` otherwise, and `method` predefined as `data`.
 
 ### Data relocations
 
@@ -838,29 +838,29 @@ Data imports represented as WebAssembly annotations of the form
 
 - `qualifier` is one of the allowed qualifiers on a symbol declaration. Qualifiers may not repeat.
 
-| `` | effect |
-|------------------|----------------------------------------------------------------------------------|
-| `weak` | sets `WASM_SYM_BINDING_WEAK` symbol flag |
-| `static` | sets `WASM_SYM_BINDING_LOCAL` symbol flag |
-| `hidden` | sets `WASM_SYM_VISIBILITY_HIDDEN` symbol flag |
-| `retain` | sets `WASM_SYM_NO_STRIP` symbol flag |
-| `thread_local` | sets `WASM_SYM_TLS` symbol flag |
-| `size=` | sets symbol's `size` appropriately |
-| `offset=` | sets `WASM_SYM_ABSOLUTE` symbol flag, sets symbol's `offset` appropriately |
-| `name=` | sets `WASM_SYM_EXPLICIT_NAME` symbol flag, sets `name_len`, `name_data` |
-| `priority=` | adds symbol to `WASM_INIT_FUNCS` section with the given priority |
-| `comdat=` | adds symbol to a `comdat` with the given id |
-
-`priority` qualifier may only be applied to function symbols.
-`size` and `offset` qualifiers may only be applied to data symbols.
-`size` and `name` qualifiers must be applied to data symbols.
-`name` qualifier must be applied to data imports.
+| `` | effect |
+|------------------|------------------------------------------------------------------------------------------------|
+| `weak` | sets `WASM_SYM_BINDING_WEAK` symbol flag |
+| `static` | sets `WASM_SYM_BINDING_LOCAL` symbol flag |
+| `hidden` | sets `WASM_SYM_VISIBILITY_HIDDEN` symbol flag |
+| `retain` | sets `WASM_SYM_NO_STRIP` symbol flag |
+| `thread_local` | sets `WASM_SYM_TLS` symbol flag |
+| `size=` | sets symbol's `size` appropriately |
+| `offset=` | sets `WASM_SYM_ABSOLUTE` symbol flag, sets symbol's `offset` appropriately |
+| `name=` | sets `WASM_SYM_EXPLICIT_NAME` symbol flag, sets symbol's `name_len`, `name_data` appropriately |
+| `priority=` | adds symbol to `WASM_INIT_FUNCS` section with the given priority |
+| `comdat=` | adds symbol to a `comdat` with the given id |
+
+- The `priority` qualifier may only be applied to function symbols.
+- The `size` and `offset` qualifiers may only be applied to data symbols.
+- The `size` and `name` qualifiers must be applied to data symbols.
+- The `name` qualifier must be applied to data imports.
 
 If all components of a symbol annotation are skipped, the annotation may be omitted.
 
 ### WebAssembly object symbols
 
-For symbols related to WebAssembly objects, the sequence of symbol annotation occurs after the optional `id` of the declaration.
+For symbols related to WebAssembly objects, the symbol annotation sequence occurs after the optional `id` of the declaration.
 For example, the following code:
 ```wat
 (import "env" "foo" (func (@sym $a retain name="a") (@sym $b hidden name="b") (param) (result)))
@@ -870,7 +870,7 @@ declares 3 symbols: one primary symbol with the name of the index of the functio
 ### Data symbols
 
 Data symbol annotations can be interleaved into the data string sequence. When that happens, relocations are situated before the first byte of the value being defined.
-For example, declaration of a 32-bit global by with the name `$foo` and linkage name "foo" would look like following:
+For example, a declaration of a 32-bit global with the name `$foo` and linkage name "foo" would look like following:
 ```wat
 (data (i32.const 0) (@sym $foo name="foo" size=4) "\00\00\00\00")
 ```
@@ -891,22 +891,28 @@ COMDAT declarations occur in the same place as module fields.
 
 ## Labels
 
-For some relocation types, an offset into a section/function is necessary. For these cases, lablels exsist.
+For some relocation types, an offset into a section/function is necessary. For these cases, labels exsist.
 Labels are represented as WebAssembly annotations of the form
 ```wat
 (@sym.label )
 ```
 
 ### Function labels
-Function labels occur in the same place as instructions. A label always denotes the first byte of the next instruction, or the byte after end of function's instruction stream if there isn't a next instruction
+Function labels occur in the same place as instructions.
+A label always denotes the first byte of the next instruction, or the byte after the end of the function's instruction stream, if there isn't a next instruction.
+
 Function label names are local to the function in which they occur.
 
 ### Data labels
-Data labels can be interleaved into the data string sequence. When that happens, relocations are situated after the last byte of the value being relocated.
+Data labels can be interleaved into the data string sequence.
+When that happens, relocations are situated after the last byte of the value being relocated.
+
 Data label names are local to the data segment in which they occur.
 
 ### Custom labels
-Custom labels can be interleaved into the data string sequence. When that happens, relocations are situated after the last byte of the value being relocated.
+Custom labels can be interleaved into the data string sequence.
+When that happens, relocations are situated after the last byte of the value being relocated.
+
 Custom label names are local to the custom section in which they occur.
 
 ## Data segment flags
@@ -917,13 +923,13 @@ Data segment flags are represented as WebAssembly annotations of the form
 
 - `qualifier` is one of the allowed qualifiers on a data segment declaration. Qualifiers may not repeat.
 
-| `` | effect |
-|-----------------|-----------------------------------------------|
-| `align=` | sets segment's `alignment` appropriately |
-| `name=` | sets `name_len`, `name_data` |
-| `strings` | sets `WASM_SEGMENT_FLAG_STRINGS` segment flag |
-| `thread_local` | sets `WASM_SEGMENT_FLAG_TLS` segment flag |
-| `retain` | sets `WASM_SEG_FLAG_RETAIN` segment flag |
+| `` | effect |
+|-----------------|------------------------------------------------------|
+| `align=` | sets segment's `alignment` appropriately |
+| `name=` | sets segment's `name_len`, `name_data` appropriately |
+| `strings` | sets `WASM_SEGMENT_FLAG_STRINGS` segment flag |
+| `thread_local` | sets `WASM_SEGMENT_FLAG_TLS` segment flag |
+| `retain` | sets `WASM_SEG_FLAG_RETAIN` segment flag |
 
 If `align` is not specified, it is given a default value of 1.
 If `name` is not specified, it is given an empty default value.

From c741bc4a71001bb3caf24da69a96d924ad22aeac Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sat, 18 Oct 2025 12:33:27 +0300
Subject: [PATCH 05/18] Fix invalid addend condition, rename relocation methods

---
 Linking.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/Linking.md b/Linking.md
index 1597d86..1e65240 100644
--- a/Linking.md
+++ b/Linking.md
@@ -764,9 +764,9 @@ Relocations are represented as WebAssembly annotations of the form
 | `global` | global* | `R_WASM_GLOBAL_INDEX_*` | Final WebAssembly global index |
 | `func` | function* | `R_WASM_FUNCTION_INDEX_*` | Final WebAssembly function index |
 | `functable` | function | `R_WASM_TABLE_INDEX_*` | Index into the dynamic function table, used for taking address of functions |
-| `text` | function | `R_WASM_FUNCTION_OFFSET` | Offset into the function body |
-| `funcsec` | function | `R_WASM_SECTION_OFFSET` | Offset into a function section |
-| `datasec` | data | `R_WASM_SECTION_OFFSET` | Offset into a data section |
+| `codeseg` | function | `R_WASM_FUNCTION_OFFSET` | Offset into the function body from the start of the function |
+| `codesec` | function | `R_WASM_SECTION_OFFSET` | Offset into the function section |
+| `datasec` | data | `R_WASM_SECTION_OFFSET` | Offset into the data section |
 | `customsec` | N/A | `R_WASM_SECTION_OFFSET` | Offset into a custom section |
 | `data` | data | `R_WASM_MEMORY_ADDR_*` | WebAssembly linear memory address |
 
@@ -782,12 +782,12 @@ Symbol kinds marked with `*` are considered *primary*.
 
 - `addend` describes the additional components of a relocation.
 
-| `` | interpretation | condition |
-|--------------|----------------------|-------------------------------------------------------------------|
-| nothing | Zero addend | always |
-| `+` | Positive byte offset | `method` allows addend |
-| `-` | Negative byte offset | `method` allows addend and `format` is signed |
-| `` | Byte offest to label | `method` allows addend and `method` is either `text` or `section` |
+| `` | interpretation | condition |
+|--------------|----------------------|-----------------------------------------------|
+| nothing | Zero addend | always |
+| `+` | Positive byte offset | `method` allows addend |
+| `-` | Negative byte offset | `method` allows addend and `format` is signed |
+| `` | Byte offest to label | `method` is either `codeseg` or `*sec` |
 
 - `symbol` describes the symbol against which to perform relocation.
   - For `funcsec` relocation method, this is the function id, so that if the addend is zero, the relocation points to the first instruction of that function.

From ea16adeb21d75a9a4b843aab3391e4f80de140e3 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sun, 19 Oct 2025 00:45:15 +0300
Subject: [PATCH 06/18] Fix formatting

---
 Linking.md | 118 ++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 84 insertions(+), 34 deletions(-)

diff --git a/Linking.md b/Linking.md
index 1e65240..9916e94 100644
--- a/Linking.md
+++ b/Linking.md
@@ -790,33 +790,61 @@ Symbol kinds marked with `*` are considered *primary*.
 | `` | Byte offest to label | `method` is either `codeseg` or `*sec` |
 
 - `symbol` describes the symbol against which to perform relocation.
-  - For `funcsec` relocation method, this is the function id, so that if the addend is zero, the relocation points to the first instruction of that function.
-  - For `datasec` relocation method, this is the data segment id, so that if the addend is zero, the relocation points to the first byte of data in that segment.
-  - For `customsec` relocation method, this is the name of the custom section, so that if the addend is zero, the relocation points to the first byte of data in that segment.
-  - For other relocation methods, this denotes the symbol in the scope of that symbol kind.
-
-The relocation type is looked up from the combination of `format`, `method`, and `modifier`. If no relocation type exists, an error is raised.
-
-If a component of a relocation is predetermined, it must be skipped in the annotation text.
-If a component of a relocation is defaulted, it may be skipped in the annotation text.
-For example, a relocation into the function table by the index of `$foo` with a predetermined `format` would look like following:
+  - For `funcsec` relocation method, this is the function id, so that if the
+    addend is zero, the relocation points to the first instruction of that
+    function.
+  - For `datasec` relocation method, this is the data segment id, so that if
+    the addend is zero, the relocation points to the first byte of data in that
+    segment.
+  - For `customsec` relocation method, this is the name of the custom section,
+    so that if the addend is zero, the relocation points to the first byte of
+    data in that segment.
+  - For other relocation methods, this denotes the symbol in the scope of that
+    symbol kind.
+
+The relocation type is looked up from the combination of `format`, `method`,
+and `modifier`. If no relocation type exists, an error is raised.
+
+If a component of a relocation is predetermined, it must be skipped in the
+annotation text.
+
+If a component of a relocation is defaulted, it may be skipped in the
+annotation text.
+
+For example, a relocation into the function table by the index of `$foo` with a
+predetermined `format` would look like following:
 ```wat
 (@reloc functable $foo)
 ```
-If all components of a relocation annotation are skipped, the annotation may be omitted.
+If all components of a relocation annotation are skipped, the annotation may be
+omitted.
 
 ### Instruction relocations
 
-For every usage of `typeidx`, `funcidx`, `globalidx`, `tagidx`, a relocation annotation is added afterwards, with `format` predefined as `leb`, `method` predefined as the *primary* method for that type, and `symbol` defaulted as the *primary* symbol of that `idx`
-
-- For the `i32.const` instruction, a relocation annotation is added after the integer literal operand, with `format` predefined as `sleb`, and `method` is allowed to be either `data` or `functable`.
-- For the `i64.const` instruction, a relocation annotation is added after the integer literal operand, with `format` predefined as `sleb64`, and `method` is allowed to be either `data` or `functable`.
-- For the `i{32,64}.{load,store}*` instructions, a relocation annotation is added after the offset operand, with `format` predefined as `leb` if the *memory* being referenced is 32-bit, and `leb64` otherwise, and `method` predefined as `data`.
+For every usage of `typeidx`, `funcidx`, `globalidx`, `tagidx`, a relocation
+annotation is added afterwards, with `format` predefined as `leb`, `method`
+predefined as the *primary* method for that type, and `symbol` defaulted as the
+*primary* symbol of that `idx`
+
+- For the `i32.const` instruction, a relocation annotation is added after the
+  integer literal operand, with `format` predefined as `sleb`, and `method` is
+  allowed to be either `data` or `functable`.
+- For the `i64.const` instruction, a relocation annotation is added after the
+  integer literal operand, with `format` predefined as `sleb64`, and `method`
+  is allowed to be either `data` or `functable`.
+- For the `i{32,64}.{load,store}*` instructions, a relocation annotation is
+  added after the offset operand, with `format` predefined as `leb` if the
+  *memory* being referenced is 32-bit, and `leb64` otherwise, and `method`
+  predefined as `data`.
 
 ### Data relocations

-In data segments, relocation annotations can be interleaved into the data string sequence. When that happens, relocations are situated after the last byte of the value being relocated.
-For example, relocation of a 32-bit function pointer `$foo` into the data segment of size 4 would look like following:
+In data segments, relocation annotations can be interleaved into the data
+string sequence. When that happens, relocations are situated after the last
+byte of the value being relocated.
+
+For example, relocation of a 32-bit function pointer `$foo` into the data
+segment of size 4 would look like following:
 ```wat
 (data (i32.const 0) "\00\00\00\00" (@reloc i32 functbl $foo))
 ```
@@ -832,11 +860,16 @@ Data imports represented as WebAssembly annotations of the form
 (@sym.import.data *)
 ```

-- `name` is the symbol name written as WebAssembly `id`, it is the name by which relocation annotations reference the symbol. If it is not present, the symbol is considered *primary* symbol for that WebAssembly object, its name is taken from the related object
+- `name` is the symbol name written as WebAssembly `id`, it is the name by
+  which relocation annotations reference the symbol. If it is not present, the
+  symbol is considered *primary* symbol for that WebAssembly object, its name
+  is taken from the related object
   - There may only be one primary symbol for each WebAssembly object.
-  - If a symbol is not associated with an object, it may not be the primary symbol.
+  - If a symbol is not associated with an object, it may not be the primary
+    symbol.

-- `qualifier` is one of the allowed qualifiers on a symbol declaration. Qualifiers may not repeat.
+- `qualifier` is one of the allowed qualifiers on a symbol declaration.
+  Qualifiers may not repeat.
 | `` | effect |
 |------------------|------------------------------------------------------------------------------------------------|
@@ -856,28 +889,37 @@ Data imports represented as WebAssembly annotations of the form
 - The `size` and `name` qualifiers must be applied to data symbols.
 - The `name` qualifier must be applied to data imports.

-If all components of a symbol annotation are skipped, the annotation may be omitted.
+If all components of a symbol annotation are skipped, the annotation may be
+omitted.

 ### WebAssembly object symbols

-For symbols related to WebAssembly objects, the symbol annotation sequence occurs after the optional `id` of the declaration.
+For symbols related to WebAssembly objects, the symbol annotation sequence
+occurs after the optional `id` of the declaration.
+
 For example, the following code:
 ```wat
 (import "env" "foo" (func (@sym $a retain name="a") (@sym $b hidden name="b") (param) (result)))
 ```
-declares 3 symbols: one primary symbol with the name of the index of the function, one symbol with the name `$a`, and one symbol with the name `$b`.
+declares 3 symbols: one primary symbol with the name of the index of the
+function, one symbol with the name `$a`, and one symbol with the name `$b`.

 ### Data symbols

-Data symbol annotations can be interleaved into the data string sequence. When that happens, relocations are situated before the first byte of the value being defined.
-For example, a declaration of a 32-bit global with the name `$foo` and linkage name "foo" would look like following:
+Data symbol annotations can be interleaved into the data string sequence.
+When that happens, symbols are situated before the first byte of the value
+being defined.
+
+For example, a declaration of a 32-bit global with the name `$foo` and linkage
+name "foo" would look like following:
 ```wat
 (data (i32.const 0) (@sym $foo name="foo" size=4) "\00\00\00\00")
 ```

 ### Data imports

-Data imports occur in the same place as module fields.
Data imports are always situated before data symbols.
+Data imports occur in the same place as module fields. Data imports are always
+situated before data symbols.

 ## COMDATs

@@ -885,13 +927,15 @@ COMDATs are represented as WebAssembly annotations of the form
 ```wat
 (@comdat )
 ```
-where `id` is the WebAssembly name of the COMDAT, and `` is `name_len` and `name_str` of the `comdat`.
+where `id` is the WebAssembly name of the COMDAT, and `` is `name_len`
+and `name_str` of the `comdat`.

 COMDAT declarations occur in the same place as module fields.

 ## Labels

-For some relocation types, an offset into a section/function is necessary. For these cases, labels exsist.
+For some relocation types, an offset into a section/function is necessary. For
+these cases, labels exist.
 Labels are represented as WebAssembly annotations of the form
 ```wat
 (@sym.label )
@@ -899,19 +943,23 @@ Labels are represented as WebAssembly annotations of the form
 ### Function labels

 Function labels occur in the same place as instructions.
-A label always denotes the first byte of the next instruction, or the byte after the end of the function's instruction stream, if there isn't a next instruction.
+A label always denotes the first byte of the next instruction, or the byte
+after the end of the function's instruction stream, if there isn't a next
+instruction.
 Function label names are local to the function in which they occur.

 ### Data labels

 Data labels can be interleaved into the data string sequence.
-When that happens, relocations are situated after the last byte of the value being relocated.
+When that happens, relocations are situated after the last byte of the value
+being relocated.
 Data label names are local to the data segment in which they occur.

 ### Custom labels

 Custom labels can be interleaved into the data string sequence.
-When that happens, relocations are situated after the last byte of the value being relocated.
+When that happens, relocations are situated after the last byte of the value
+being relocated.
 Custom label names are local to the custom section in which they occur.

@@ -921,7 +969,8 @@ Data segment flags are represented as WebAssembly annotations of the form
 (@sym.segment *)
 ```

-- `qualifier` is one of the allowed qualifiers on a data segment declaration. Qualifiers may not repeat.
+- `qualifier` is one of the allowed qualifiers on a data segment declaration.
+Qualifiers may not repeat.

 | `` | effect |
 |-----------------|------------------------------------------------------|
@@ -936,4 +985,5 @@ If `name` is not specified, it is given an empty default value.

 If all components of segment flags are skipped, the annotation may be omitted.

-Data segment annotation occurs after the optional `id` of the data segment declaration.
+Data segment annotation occurs after the optional `id` of the data segment
+declaration.
From a5f894e8cbca81039b1dc5ec01d6204be9128055 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sun, 19 Oct 2025 01:05:54 +0300
Subject: [PATCH 07/18] Introduce binding and visibility qualifiers

---
 Linking.md | 42 ++++++++++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/Linking.md b/Linking.md
index 9916e94..dbe9aae 100644
--- a/Linking.md
+++ b/Linking.md
@@ -871,18 +871,36 @@ Data imports represented as WebAssembly annotations of the form
 - `qualifier` is one of the allowed qualifiers on a symbol declaration.
   Qualifiers may not repeat.
-| `` | effect |
-|------------------|------------------------------------------------------------------------------------------------|
-| `weak` | sets `WASM_SYM_BINDING_WEAK` symbol flag |
-| `static` | sets `WASM_SYM_BINDING_LOCAL` symbol flag |
-| `hidden` | sets `WASM_SYM_VISIBILITY_HIDDEN` symbol flag |
-| `retain` | sets `WASM_SYM_NO_STRIP` symbol flag |
-| `thread_local` | sets `WASM_SYM_TLS` symbol flag |
-| `size=` | sets symbol's `size` appropriately |
-| `offset=` | sets `WASM_SYM_ABSOLUTE` symbol flag, sets symbol's `offset` appropriately |
-| `name=` | sets `WASM_SYM_EXPLICIT_NAME` symbol flag, sets symbol's `name_len`, `name_data` appropriately |
-| `priority=` | adds symbol to `WASM_INIT_FUNCS` section with the given priority |
-| `comdat=` | adds symbol to a `comdat` with the given id |
+| `` | effect |
+|---------------------------|-----------------------------------------------|
+| `binding=` | sets symbol flags according to `` |
+| `visibility=` | sets symbol flags according to `` |
+| `retain` | sets `WASM_SYM_NO_STRIP` symbol flag |
+| `thread_local` | sets `WASM_SYM_TLS` symbol flag |
+| `size=` | sets symbol's `size` appropriately |
+| `offset=` | sets `WASM_SYM_ABSOLUTE` symbol flag, sets symbol's `offset` appropriately |
+| `name=` | sets `WASM_SYM_EXPLICIT_NAME` symbol flag, sets symbol's `name_len`, `name_data` appropriately |
+| `priority=` | adds symbol to `WASM_INIT_FUNCS` section with the given priority |
+| `comdat=` | adds symbol to a `comdat` with the given id |
+
+| `` | flag |
+|-------------|--------------------------|
+| `global` | 0 |
+| `local` | `WASM_SYM_BINDING_LOCAL` |
+| `weak` | `WASM_SYM_BINDING_WEAK` |
+
+| `` | flag |
+|----------------|------------------------------|
+| `default` | |
+| `hidden` | `WASM_SYM_VISIBILITY_HIDDEN` |
+
+Shorthands may be used in place of full qualifiers:
+
+| shorthand | resulting qualifier |
+|-----------|---------------------|
+| `hidden` | `visibility=hidden` |
+| `local` |
`binding=local` |
+| `weak` | `binding=weak` |

 - The `priority` qualifier may only be applied to function symbols.
 - The `size` and `offset` qualifiers may only be applied to data symbols.
From d7ecb3046d3a1f2fd32e4eaff0f2106f057091f0 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sun, 19 Oct 2025 01:20:47 +0300
Subject: [PATCH 08/18] Make the example with data relocations more complex

---
 Linking.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Linking.md b/Linking.md
index dbe9aae..b5d8768 100644
--- a/Linking.md
+++ b/Linking.md
@@ -843,10 +843,11 @@ In data segments, relocation annotations can be interleaved into the data
 string sequence. When that happens, relocations are situated after the last
 byte of the value being relocated.

-For example, relocation of a 32-bit function pointer `$foo` into the data
-segment of size 4 would look like following:
+For example, relocation of a 32-bit function pointer `$foo` and a 32-bit
+reference to a data symbol `$bar` into the data segment of size 8 would look
+like following:
 ```wat
-(data (i32.const 0) "\00\00\00\00" (@reloc i32 functbl $foo))
+(data (i32.const 0) "\00\00\00\00" (@reloc i32 functable $foo) "\00\00\00\00" (@reloc i32 data $bar))
 ```

 ## Symbols
From 79c827e32cd9038dbf08a9d4516e46fe22745be8 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Sun, 19 Oct 2025 01:55:53 +0300
Subject: [PATCH 09/18] Add an overview for the text format description

---
 Linking.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Linking.md b/Linking.md
index b5d8768..2a38a54 100644
--- a/Linking.md
+++ b/Linking.md
@@ -737,6 +737,10 @@ passive.

 # Text format

+The text format for linking metadata is intended for WAT consumers that wish to
+emit relocatable object files, and WAT producers that wish to emit human-readable
+relocation metadata for later creation of a relocatable object file.
+
 ## Relocations

 Relocations are represented as WebAssembly annotations of the form
From 98955510c59aeea3c925deda5b8a70c525608123 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Mon, 20 Oct 2025 00:52:17 +0300
Subject: [PATCH 10/18] Add additional validation rules for object files

---
 Linking.md | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/Linking.md b/Linking.md
index 2a38a54..4b20099 100644
--- a/Linking.md
+++ b/Linking.md
@@ -181,6 +181,47 @@ relocations applied to the CODE section, a relocation cannot straddle two
 functions, and for the DATA section relocations must lie within a data
 element's body.

+### Additional validation rules
+
+When perfoming validation on object files, care must be taken to ensure that
+meaningless relocations are not present in the binary.
+
+**Note**: Linker is not required to perform validation on its input object
+files.
+
+When relocations occur in the CODE section, only the following relocations may
+occur:
+
+| relocation type | condition the value at relocation offset |
+|---------------------------------|------------------------------------------|
+| `R_WASM_FUNCTION_INDEX_LEB` | must represent a `funcidx` |
+| `R_WASM_TYPE_INDEX_LEB` | must represent a `typeidx` |
+| `R_WASM_GLOBAL_INDEX_LEB` | must represent a `globalidx` |
+| `R_WASM_EVENT_INDEX_LEB` | must represent a `tagidx` |
+| `R_WASM_TABLE_NUMBER_LEB` | must represent a `tableidx` |
+| `R_WASM_TABLE_INDEX_SLEB` | must represent an operand of `i32.const` |
+| `R_WASM_TABLE_INDEX_SLEB64` | must represent an operand of `i64.const` |
+| `R_WASM_MEMORY_ADDR_SLEB` | must represent an operand of `i32.const` |
+| `R_WASM_MEMORY_ADDR_REL_SLEB` | must represent an operand of `i32.const` |
+| `R_WASM_MEMORY_ADDR_TLS_SLEB` | must represent an operand of `i32.const` |
+| `R_WASM_MEMORY_ADDR_SLEB64` | must represent an operand of `i64.const` |
+| `R_WASM_MEMORY_ADDR_REL_SLEB64` | must
represent an operand of `i64.const` |
+| `R_WASM_MEMORY_ADDR_TLS_SLEB64` | must represent an operand of `i64.const` |
+| `R_WASM_MEMORY_ADDR_LEB` | must represent the `offset` part of `memarg` where `memidx` references a 32-bit memory |
+| `R_WASM_MEMORY_ADDR_LEB64` | must represent the `offset` part of `memarg` where `memidx` references a 64-bit memory |
+
+For `R_WASM_*_OFFSET_I*` relocations, the following condidions must hold for
+the addend:
+
+- If `index` references the CODE section, the addend must represent the first
+  byte of an instruction, or the byte after the last instruction.
+- If `index` references the DATA section, the addend must represent a valid
+  offset into a data segment's data area.
+- If `index` references the custom section, the addend must represent a valid
+  offset into that custom section's data area.
+
+All other relocations are considered invalid for the purposes of validation
+
 ## Linking Metadata Section

 A linking metadata section is a user-defined section with the name
@@ -322,6 +363,8 @@ For section symbols:
 | ------------ | -------------- | ------------------------------------------- |
 | section | `varuint32` | the index of the target section |

+Section symbols may only reference the CODE section, the DATA section, or custom sections.
+
 The current set of valid flags for symbols are:

 - `1 / WASM_SYM_BINDING_WEAK` - Indicating that this is a weak symbol. When
From c5270f7c57e52aefbf76783e12b6befcb792b699 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Mon, 20 Oct 2025 00:57:53 +0300
Subject: [PATCH 11/18] Turn the overlong leb note into a validation rule

---
 Linking.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Linking.md b/Linking.md
index 4b20099..17a47e7 100644
--- a/Linking.md
+++ b/Linking.md
@@ -63,10 +63,6 @@ The "reloc."
custom sections must come after the
 ["linking"](#linking-metadata-section) custom section in order to validate
 relocation indices.

-Any LEB128-encoded values should be maximally padded so that they can be
-rewritten without affecting the position of any other bytes. For instance, the
-function index 3 should be encoded as `0x83 0x80 0x80 0x80 0x00`.
-
 Relocations contain the following fields:

 | Field | Type | Description |
@@ -189,6 +185,10 @@ meaningless relocations are not present in the binary.
 **Note**: Linker is not required to perform validation on its input object
 files.

+All LEB128-encoded values that are to be relocated must be maximally padded so
+that they can be rewritten without affecting the position of any other bytes.
+For instance, the function index 3 must be encoded as `0x83 0x80 0x80 0x80 0x00`.
+
 When relocations occur in the CODE section, only the following relocations may
 occur:

From 1e66962869fb2849c7ceb32a942c92b776df26b3 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Mon, 20 Oct 2025 21:10:08 +0300
Subject: [PATCH 12/18] Revert "Turn the overlong leb note into a validation rule"

This reverts commit c5270f7c57e52aefbf76783e12b6befcb792b699.

---
 Linking.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Linking.md b/Linking.md
index 17a47e7..4b20099 100644
--- a/Linking.md
+++ b/Linking.md
@@ -63,6 +63,10 @@ The "reloc." custom sections must come after the
 ["linking"](#linking-metadata-section) custom section in order to validate
 relocation indices.

+Any LEB128-encoded values should be maximally padded so that they can be
+rewritten without affecting the position of any other bytes. For instance, the
+function index 3 should be encoded as `0x83 0x80 0x80 0x80 0x00`.
+
 Relocations contain the following fields:

 | Field | Type | Description |
@@ -185,10 +189,6 @@ meaningless relocations are not present in the binary.
 **Note**: Linker is not required to perform validation on its input object
 files.

-All LEB128-encoded values that are to be relocated must be maximally padded so
-that they can be rewritten without affecting the position of any other bytes.
-For instance, the function index 3 must be encoded as `0x83 0x80 0x80 0x80 0x00`.
-
 When relocations occur in the CODE section, only the following relocations may
 occur:

From 422f2c021c4f37d8d8b8c7f8e4e318564b58d3ee Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Mon, 20 Oct 2025 21:10:12 +0300
Subject: [PATCH 13/18] Revert "Add additional validation rules for object files"

This reverts commit 98955510c59aeea3c925deda5b8a70c525608123.

---
 Linking.md | 43 -------------------------------------------
 1 file changed, 43 deletions(-)

diff --git a/Linking.md b/Linking.md
index 4b20099..2a38a54 100644
--- a/Linking.md
+++ b/Linking.md
@@ -181,47 +181,6 @@ relocations applied to the CODE section, a relocation cannot straddle two
 functions, and for the DATA section relocations must lie within a data
 element's body.

-### Additional validation rules
-
-When perfoming validation on object files, care must be taken to ensure that
-meaningless relocations are not present in the binary.
-
-**Note**: Linker is not required to perform validation on its input object
-files.
-
-When relocations occur in the CODE section, only the following relocations may
-occur:
-
-| relocation type | condition the value at relocation offset |
-|---------------------------------|------------------------------------------|
-| `R_WASM_FUNCTION_INDEX_LEB` | must represent a `funcidx` |
-| `R_WASM_TYPE_INDEX_LEB` | must represent a `typeidx` |
-| `R_WASM_GLOBAL_INDEX_LEB` | must represent a `globalidx` |
-| `R_WASM_EVENT_INDEX_LEB` | must represent a `tagidx` |
-| `R_WASM_TABLE_NUMBER_LEB` | must represent a `tableidx` |
-| `R_WASM_TABLE_INDEX_SLEB` | must represent an operand of `i32.const` |
-| `R_WASM_TABLE_INDEX_SLEB64` | must represent an operand of `i64.const` |
-| `R_WASM_MEMORY_ADDR_SLEB` | must represent an operand of `i32.const` |
-| `R_WASM_MEMORY_ADDR_REL_SLEB` | must represent an operand of `i32.const` |
-| `R_WASM_MEMORY_ADDR_TLS_SLEB` | must represent an operand of `i32.const` |
-| `R_WASM_MEMORY_ADDR_SLEB64` | must represent an operand of `i64.const` |
-| `R_WASM_MEMORY_ADDR_REL_SLEB64` | must represent an operand of `i64.const` |
-| `R_WASM_MEMORY_ADDR_TLS_SLEB64` | must represent an operand of `i64.const` |
-| `R_WASM_MEMORY_ADDR_LEB` | must represent the `offset` part of `memarg` where `memidx` references a 32-bit memory |
-| `R_WASM_MEMORY_ADDR_LEB64` | must represent the `offset` part of `memarg` where `memidx` references a 64-bit memory |
-
-For `R_WASM_*_OFFSET_I*` relocations, the following condidions must hold for
-the addend:
-
-- If `index` references the CODE section, the addend must represent the first
-  byte of an instruction, or the byte after the last instruction.
-- If `index` references the DATA section, the addend must represent a valid
-  offset into a data segment's data area.
-- If `index` references the custom section, the addend must represent a valid
-  offset into that custom section's data area.
-
-All other relocations are considered invalid for the purposes of validation
-
 ## Linking Metadata Section

 A linking metadata section is a user-defined section with the name
@@ -363,8 +322,6 @@ For section symbols:
 | ------------ | -------------- | ------------------------------------------- |
 | section | `varuint32` | the index of the target section |

-Section symbols may only reference the CODE section, the DATA section, or custom sections.
-
 The current set of valid flags for symbols are:

 - `1 / WASM_SYM_BINDING_WEAK` - Indicating that this is a weak symbol. When
From c4be34b49707336867f5ae349fd2deeac3529f67 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Tue, 21 Oct 2025 08:30:27 +0300
Subject: [PATCH 14/18] Clarify rules about symbol identifiers

---
 Linking.md | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/Linking.md b/Linking.md
index 2a38a54..3d79e81 100644
--- a/Linking.md
+++ b/Linking.md
@@ -856,6 +856,15 @@ like following:

 ## Symbols

+For each relocatable WebAssembly entity type, there exists a corresponding
+symbol identifier namespace for symbols of that type.
+
+Additionally, a symbol identifier namespace exists for data symbols.
+
+Symbol identifier namespaces differ from common index spaces in that they also
+allow purely textual names in addition to numeric + optional textual names
+allowed by index spaces.
+
 Symbols are represented as WebAssembly annotations of the form
 ```wat
 (@sym *)
@@ -870,8 +879,16 @@ Data imports represented as WebAssembly annotations of the form
   symbol is considered *primary* symbol for that WebAssembly object, its name
   is taken from the related object
   - There may only be one primary symbol for each WebAssembly object.
-  - If a symbol is not associated with an object, it may not be the primary
-    symbol.
+  - If a symbol is not associated with a WebAssembly entity, it may not be the
+    primary symbol.
+
+After a name for the symbol is determined, it is placed into the symbol
+identifier namespace corresponding to that symbol type.
+
+> [!Note]
+> As a consequence of that, the only symbols that can be referred to by a
+> numeric index are _primary_ symbols, since they inherit their numeric index
+> form the relocatable WebAssebly object.

 - `qualifier` is one of the allowed qualifiers on a symbol declaration.
   Qualifiers may not repeat.
@@ -915,6 +932,11 @@ Shorthands may be used in place of full qualifiers:
 If all components of a symbol annotation are skipped, the annotation may be
 omitted.

+> [!Note]
+> Since all components of a symbol can be skipped, a _primary_ symbol always
+> exists for all WebAssembly entities, even if the annotation without a `name`
+> is not present in the symbol sequence
+
 ### WebAssembly object symbols

 For symbols related to WebAssembly objects, the symbol annotation sequence
From f01ca3aa361641ffdef3c9229febedae831c5fe3 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Tue, 21 Oct 2025 08:34:25 +0300
Subject: [PATCH 15/18] Replace ambiguous "Wasm object" by "WebAssembly entity"

---
 Linking.md | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/Linking.md b/Linking.md
index 3d79e81..7bacd1e 100644
--- a/Linking.md
+++ b/Linking.md
@@ -290,11 +290,12 @@ where a `syminfo` is encoded as:
 | | | `5 / SYMTAB_TABLE` |
 | flags | `varuint32` | a bitfield containing flags for this symbol |

-For functions, globals, events and tables, we reference an existing Wasm object, which
-is either an import or a defined function/global/event/table (recall that the operand of a
-Wasm `call` instruction uses an index space consisting of the function imports
-followed by the defined functions, and similarly `get_global` for global imports
-and definitions and `throw` for event imports and definitions).
+For functions, globals, events and tables, we reference an existing WebAssembly
+entity, which is either an import or a defined function/global/event/table
+(recall that the operand of a Wasm `call` instruction uses an index space
+consisting of the function imports followed by the defined functions, and
+similarly `get_global` for global imports and definitions and `throw` for event
+imports and definitions).

 If a symbols refers to an import, and the
 `WASM_SYM_EXPLICIT_NAME` flag is not set, then the name is taken from the
@@ -302,7 +303,7 @@ import; otherwise the `syminfo` specifies the symbol's name.

 | Field | Type | Description |
 | ------------ | -------------- | ------------------------------------------- |
-| index | `varuint32` | the index of the Wasm object corresponding to the symbol, which references an import if and only if the `WASM_SYM_UNDEFINED` flag is set |
+| index | `varuint32` | the index of the WebAssembly entity corresponding to the symbol, which references an import if and only if the `WASM_SYM_UNDEFINED` flag is set |
 | name_len | `varuint32` ? | the optional length of `name_data` in bytes, omitted if `index` references an import |
 | name_data | `bytes` ? | UTF-8 encoding of the symbol name, omitted if `index` references an import |

@@ -876,9 +877,9 @@ Data imports represented as WebAssembly annotations of the form

 - `name` is the symbol name written as WebAssembly `id`, it is the name by
   which relocation annotations reference the symbol. If it is not present, the
-  symbol is considered *primary* symbol for that WebAssembly object, its name
-  is taken from the related object
-  - There may only be one primary symbol for each WebAssembly object.
+  symbol is considered *primary* symbol for that WebAssembly entity, its name
+  is taken from the related entity
+  - There may only be one primary symbol for each WebAssembly entity.
   - If a symbol is not associated with a WebAssembly entity, it may not be the
     primary symbol.
@@ -888,7 +889,7 @@ identifier namespace corresponding to that symbol type.
 > [!Note]
 > As a consequence of that, the only symbols that can be referred to by a
 > numeric index are _primary_ symbols, since they inherit their numeric index
-> form the relocatable WebAssebly object.
+> from the relocatable WebAssembly entity.

 - `qualifier` is one of the allowed qualifiers on a symbol declaration.
   Qualifiers may not repeat.
@@ -937,9 +938,9 @@ omitted.
 > exists for all WebAssembly entities, even if the annotation without a `name`
 > is not present in the symbol sequence

-### WebAssembly object symbols
+### WebAssembly entity symbols

-For symbols related to WebAssembly objects, the symbol annotation sequence
+For symbols related to a WebAssembly entity, the symbol annotation sequence
 occurs after the optional `id` of the declaration.

 For example, the following code:
From 41ddb2295a15f95ea7e412e57b830589b7e546c4 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Tue, 21 Oct 2025 08:36:36 +0300
Subject: [PATCH 16/18] Replace "thread_local" by "tls"

---
 Linking.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Linking.md b/Linking.md
index 7bacd1e..4e72c4a 100644
--- a/Linking.md
+++ b/Linking.md
@@ -899,7 +899,7 @@ identifier namespace corresponding to that symbol type.
 | `binding=` | sets symbol flags according to `` |
 | `visibility=` | sets symbol flags according to `` |
 | `retain` | sets `WASM_SYM_NO_STRIP` symbol flag |
-| `thread_local` | sets `WASM_SYM_TLS` symbol flag |
+| `tls` | sets `WASM_SYM_TLS` symbol flag |
 | `size=` | sets symbol's `size` appropriately |
 | `offset=` | sets `WASM_SYM_ABSOLUTE` symbol flag, sets symbol's `offset` appropriately |
 | `name=` | sets `WASM_SYM_EXPLICIT_NAME` symbol flag, sets symbol's `name_len`, `name_data` appropriately |
From 6fab7ca50cbaab7d02d3b21a7e15de88009c2789 Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Tue, 21 Oct 2025 23:30:55 +0300
Subject: [PATCH 17/18] Make qualifiers use parens instead of `=`

---
 Linking.md | 50 +++++++++++++++++++++-----------------------------
 1 file changed, 21 insertions(+), 29 deletions(-)

diff --git a/Linking.md b/Linking.md
index 4e72c4a..639f751 100644
--- a/Linking.md
+++ b/Linking.md
@@ -894,17 +894,17 @@ identifier namespace corresponding to that symbol type.
 - `qualifier` is one of the allowed qualifiers on a symbol declaration.
   Qualifiers may not repeat.
-| `` | effect |
-|---------------------------|-----------------------------------------------|
-| `binding=` | sets symbol flags according to `` |
-| `visibility=` | sets symbol flags according to `` |
-| `retain` | sets `WASM_SYM_NO_STRIP` symbol flag |
-| `tls` | sets `WASM_SYM_TLS` symbol flag |
-| `size=` | sets symbol's `size` appropriately |
-| `offset=` | sets `WASM_SYM_ABSOLUTE` symbol flag, sets symbol's `offset` appropriately |
-| `name=` | sets `WASM_SYM_EXPLICIT_NAME` symbol flag, sets symbol's `name_len`, `name_data` appropriately |
-| `priority=` | adds symbol to `WASM_INIT_FUNCS` section with the given priority |
-| `comdat=` | adds symbol to a `comdat` with the given id |
+| `` | effect |
+|---------------------|-----------------------------------------------|
+| `` | sets symbol flags according to `` |
+| `` | sets symbol flags according to `` |
+| `retain` | sets `WASM_SYM_NO_STRIP` symbol flag |
+| `tls` | sets `WASM_SYM_TLS` symbol flag |
+| `(size )` | sets symbol's `size` appropriately |
+| `(offset )` | sets `WASM_SYM_ABSOLUTE` symbol flag, sets symbol's `offset` appropriately |
+| `(name )` | sets `WASM_SYM_EXPLICIT_NAME` symbol flag, sets symbol's `name_len`, `name_data` appropriately |
+| `(init_prio )` | adds symbol to `WASM_INIT_FUNCS` section with the given priority |
+| `(comdat )` | adds symbol to a `comdat` with the given id |

 | `` | flag |
 |-------------|--------------------------|
@@ -914,17 +914,9 @@ identifier namespace corresponding to that symbol type.

 | `` | flag |
 |----------------|------------------------------|
-| `default` | |
+| `default` | 0 |
 | `hidden` | `WASM_SYM_VISIBILITY_HIDDEN` |

-Shorthands may be used in place of full qualifiers:
-
-| shorthand | resulting qualifier |
-|-----------|---------------------|
-| `hidden` | `visibility=hidden` |
-| `local` | `binding=local` |
-| `weak` | `binding=weak` |
-
 - The `priority` qualifier may only be applied to function symbols.
 - The `size` and `offset` qualifiers may only be applied to data symbols.
 - The `size` and `name` qualifiers must be applied to data symbols.
@@ -945,7 +937,7 @@ occurs after the optional `id` of the declaration.

 For example, the following code:
 ```wat
-(import "env" "foo" (func (@sym $a retain name="a") (@sym $b hidden name="b") (param) (result)))
+(import "env" "foo" (func (@sym $a retain (name "a")) (@sym $b hidden (name "b")) (param) (result)))
 ```
 declares 3 symbols: one primary symbol with the name of the index of the
 function, one symbol with the name `$a`, and one symbol with the name `$b`.
@@ -959,7 +951,7 @@ being defined.
 For example, a declaration of a 32-bit global with the name `$foo` and linkage
 name "foo" would look like following:
 ```wat
-(data (i32.const 0) (@sym $foo name="foo" size=4) "\00\00\00\00")
+(data (i32.const 0) (@sym $foo (name "foo") (size 4)) "\00\00\00\00")
 ```

 ### Data imports
@@ -1018,13 +1010,13 @@ Data segment flags are represented as WebAssembly annotations of the form
 - `qualifier` is one of the allowed qualifiers on a data segment declaration.
 Qualifiers may not repeat.

-| `` | effect |
-|-----------------|------------------------------------------------------|
-| `align=` | sets segment's `alignment` appropriately |
-| `name=` | sets segment's `name_len`, `name_data` appropriately |
-| `strings` | sets `WASM_SEGMENT_FLAG_STRINGS` segment flag |
-| `thread_local` | sets `WASM_SEGMENT_FLAG_TLS` segment flag |
-| `retain` | sets `WASM_SEG_FLAG_RETAIN` segment flag |
+| `` | effect |
+|-------------------|------------------------------------------------------|
+| `(align )` | sets segment's `alignment` appropriately |
+| `(name )` | sets segment's `name_len`, `name_data` appropriately |
+| `strings` | sets `WASM_SEGMENT_FLAG_STRINGS` segment flag |
+| `tls` | sets `WASM_SEGMENT_FLAG_TLS` segment flag |
+| `retain` | sets `WASM_SEG_FLAG_RETAIN` segment flag |

 If `align` is not specified, it is given a default value of 1.
 If `name` is not specified, it is given an empty default value.

From fd65da7107db6cbce97f19b5af0081f10f390f4f Mon Sep 17 00:00:00 2001
From: feedable <141534996+feedab1e@users.noreply.github.com>
Date: Fri, 24 Oct 2025 20:08:04 +0300
Subject: [PATCH 18/18] Edit offset relocations to only point to either functions or segments

---
 Linking.md | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/Linking.md b/Linking.md
index 639f751..2b0b20b 100644
--- a/Linking.md
+++ b/Linking.md
@@ -762,18 +762,16 @@ Relocations are represented as WebAssembly annotations of the form

 - `method` describes the type of relocation, so what kind of symbol we are
 relocating against and how to interpret that symbol.
-| `` | symbol kind | corresponding relocation constants | interpretation |
-|-------------|-------------|------------------------------------|-----------------------------------|
-| `tag` | event* | `R_WASM_EVENT_INDEX_*` | Final WebAssembly event index |
-| `table` | table* | `R_WASM_TABLE_NUMBER_*` | Final WebAssembly table index (index of a table, not into one) |
-| `global` | global* | `R_WASM_GLOBAL_INDEX_*` | Final WebAssembly global index |
-| `func` | function* | `R_WASM_FUNCTION_INDEX_*` | Final WebAssembly function index |
-| `functable` | function | `R_WASM_TABLE_INDEX_*` | Index into the dynamic function table, used for taking address of functions |
-| `codeseg` | function | `R_WASM_FUNCTION_OFFSET` | Offset into the function body from the start of the function |
-| `codesec` | function | `R_WASM_SECTION_OFFSET` | Offset into the function section |
-| `datasec` | data | `R_WASM_SECTION_OFFSET` | Offset into the data section |
-| `customsec` | N/A | `R_WASM_SECTION_OFFSET` | Offset into a custom section |
-| `data` | data | `R_WASM_MEMORY_ADDR_*` | WebAssembly linear memory address |
+| `` | symbol kind | corresponding relocation constants | interpretation |
+|--------------|-------------|------------------------------------|-----------------------------------|
+| `tag` | event* | `R_WASM_EVENT_INDEX_*` | Final WebAssembly event index |
+| `table` | table* | `R_WASM_TABLE_NUMBER_*` | Final WebAssembly table index (index of a table, not into one) |
+| `global` | global* | `R_WASM_GLOBAL_INDEX_*` | Final WebAssembly global index |
+| `func` | function* | `R_WASM_FUNCTION_INDEX_*` | Final WebAssembly function index |
+| `functable` | function | `R_WASM_TABLE_INDEX_*` | Index into the dynamic function table, used for taking address of functions |
+| `functext` | function | `R_WASM_FUNCTION_OFFSET` | Offset into the function body from the start of the function |
+| `customtext` | section | `R_WASM_SECTION_OFFSET` | Offset into a custom section |
+| `data` | data | `R_WASM_MEMORY_ADDR_*` | WebAssembly linear memory address |

 Symbol kinds marked with `*` are considered *primary*.

@@ -792,16 +790,13 @@ Symbol kinds marked with `*` are considered *primary*.
 | nothing | Zero addend | always |
 | `+` | Positive byte offset | `method` allows addend |
 | `-` | Negative byte offset | `method` allows addend and `format` is signed |
-| `` | Byte offest to label | `method` is either `codeseg` or `*sec` |
+| `` | Byte offset to label | `method` is `*text` |

 - `symbol` describes the symbol against which to perform relocation.
-  - For `funcsec` relocation method, this is the function id, so that if the
+  - For `functext` relocation method, this is the function id, so that if the
     addend is zero, the relocation points to the first instruction of that
     function.
-  - For `datasec` relocation method, this is the data segment id, so that if
-    the addend is zero, the relocation points to the first byte of data in that
-    segment.
-  - For `customsec` relocation method, this is the name of the custom section,
+  - For `customtext` relocation method, this is the name of the custom section,
     so that if the addend is zero, the relocation points to the first byte of
     data in that segment.
   - For other relocation methods, this denotes the symbol in the scope of that