Commit ee8452b

Multiple stable memory regions (and a new primitive type Region). (#3768)
Runtime-system implementation of the API described in dfinity/motoko-base#516

Checklist:
- [x] pass existing tests for existing (experimental) stable memory, using region0 as that stable memory.
- [x] distinct regions are isolated (see `test/run-drun/stable-regions-are-isolated.mo`).
- [x] serialize/deserialize `Region` type.
- [x] migration path for existing (experimental) stable memory, into region0 of new region manager.
- [x] trap when Region.new fails?
- [x] add version to metadata
- [x] basic perf test
- [x] maybe remove (most of) Region0.rs
- [x] align errors
- [ ] improve test region0/stable-mem-big-blog.mo to check contents of large read/writes (done in halves)
- [ ] restrict access to new somehow (can be done later, but nicer to have on release).
- [x] rts_stable_mem_size?
- [x] max-stable-pages (limits physical pages, i.e. include meta-data)
- [ ] registers not stack allocation?
- [x] clean up asserts on tag ranges
- [x] lazy metadata allocation?
- [x] add Region.mo to next_moc
- [x] document Region.mo
- [ ] perf maybe: turn ic_mem_fns into ordinary externs where poss - affects perf IIRC from experience of @luc-blaeser (NOTE, I couldn't see this myself after a simple experiment)
- [x] elim Region0.rs operations wrappers and just do the conditional compilation in compile.ml - bench shows inlining should be a win.
- [ ] simplify chunked read writes to iterate over vec_pages entries directly should be cheaper.
- [x] specify Region type in manual

TODO separately:
- [ ] add stable-regions.md to portal doc (sidebars.js)
- [ ] gracefully fail compilation of stablemem/region ops by generating trapping functions
- [ ] protect access to Region.new

See also:
- [Forum post, for community discussion.](https://forum.dfinity.org/t/motoko-stable-regions/19182)
- **StableBTree** edits needed to upgrade to this system:
  - sardariuss/MotokoStableBTree#4
  - sardariuss/MotokoStableBTreeTest#1
1 parent bbe8c25 commit ee8452b

150 files changed

+9083
-806
lines changed

.github/workflows/test.yml

+5-3
@@ -107,7 +107,10 @@ jobs:
107107
if: github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository && contains(github.event.pull_request.labels.*.name, 'build_artifacts')
108108
needs: tests
109109
concurrency: ci-${{ github.ref }}
110-
runs-on: ubuntu-latest
110+
strategy:
111+
matrix:
112+
os: [ ubuntu-latest, macos-latest ]
113+
runs-on: ${{ matrix.os }}
111114
steps:
112115
- uses: actions/checkout@v4
113116
- uses: cachix/install-nix-action@v23
@@ -123,7 +126,6 @@ jobs:
123126
- name: upload artifacts
124127
uses: actions/upload-artifact@v3
125128
with:
126-
name: moc
129+
name: moc-${{ matrix.os }}
127130
path: ${{ env.UPLOAD_PATH }}
128131
retention-days: 5
129-

Changelog.md

+5
@@ -1,5 +1,10 @@
11
# Motoko compiler changelog
22

3+
* motoko (`moc`)
4+
5+
* Added a new stable `Region` type (#3768) of dynamically allocated, independently growable and
6+
isolated regions of IC stable memory. See documentation.
7+
38
## 0.9.8 (2023-08-11)
49

510
* motoko (`moc`)

design/StableRegions.md

+333
@@ -0,0 +1,333 @@
1+
# Stable Region API
2+
3+
See StableMemory.md for context of the current experimental API.
4+
5+
This document aims to specify the API and memory representations for a generalization
6+
of this API that permits multiple isolated _regions_ of stable memory, where each can be
7+
grown independently.
8+
9+
The **region manager** is the state and logic to support this generalization.
10+
11+
12+
## Role for "stable regions" in Motoko
13+
14+
The current stable memory module in `base` has been "experimental" for a long time, and requires a more composable API to graduate from this status.
15+
16+
Stable regions address the problem that today's `ExperimentalStableMemory` module only provides a single, monolithic memory that makes it unsuitable for directly building composable software parts.
17+
18+
Stable regions permit a new API that supports composable use cases.
19+
20+
Stable regions also bring Motoko closer to parity with Rust canister development support today, by giving a run-time-system-based analog of a special Rust library for stable data structures that allocates “pages” for them from stable memory in separate, isolated, memory regions.
21+
22+
23+
## Design space
24+
25+
The design space for the page allocator is defined by at least two
26+
tensions:
27+
28+
1. fully-stable representation of allocator meta data **versus** fast load/store operations.
29+
30+
2. Total scaling capacity **versus** minimum footprint for meta data.
31+
32+
33+
**Tension 1** is introduced because we want to avoid relying on the Motoko heap as the "ground truth" about the allocator's state. If this heap is lost, as it is during an upgrade, then a developer may still want to recover all of the regions' data and meta data.
34+
35+
Tension 1 is resolved by storing the ground truth in stable memory, keeping it in sync with heap structures that permit faster access operations.
36+
37+
Compared with the Rust version, we store enough extra meta data to permit:
38+
39+
- Regions whose page blocks are in arbitrary order, not
40+
necessarily in order of smallest to highest address.
41+
42+
- 2^64-1 Regions max (instead of 255 Regions max).
43+
Due to the limit on blocks, only 2^16-1 can have non-zero page size.
44+
45+
We want to permit arbitrary page block orders to make a smooth
46+
transition to region reclamation and re-allocation in the near
47+
future, with potential integration into the Motoko GC. The
48+
extra complexity is modest, and seems "worth" the cost.
49+
50+
We change the maximum region limit because 255 may be too small in
51+
some extreme cases and incompatible with GC.
52+
Instead, we can freely allocate new regions, recycling blocks, but not
53+
Region ids. The id of a Region is invariant and will not change, even with GC.
54+
55+
We address the question of whether the new limit of 32k regions is
56+
"enough" in the Q&A section (it is, for all practical purposes).
57+
58+
59+
**Tension 2** is introduced because we want a design that will continue
60+
to work even when canisters can store more stable data than today (32GB).
61+
62+
Tension 2 is resolved by making prudent representation choices.
63+
64+
The representations we choose for regions and region identifiers
65+
permit a scaling to 256GB of stable data while still permitting meta
66+
data to be repeated in both stable and non-stable arenas. These are
67+
the same limits imposed by the Rust implementation, for the same
68+
reasons. See Q&A for more discussion.
69+
70+
71+
## Definitions and constants
72+
73+
- a **page** is 65536 bytes.
74+
- a **page block** is a contiguous sequence of 128 pages (~8MB).
75+
- a **page block index** is a 16 bit, index-based identifier for a page block.
76+
- a **region** is a sequence of (generally non-contiguous) **page blocks**.
77+
- the maximum number of page blocks is 32768.
78+
- the maximum amount of stable memory for all regions is 256GB.
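
The arithmetic behind these constants can be checked with a short sketch. The constant and function names below are illustrative only and are not taken from the runtime system; the numeric values come from the list above.

```rust
// Values from the "Definitions and constants" list; names are hypothetical.
const PAGE_SIZE_BYTES: u64 = 65_536; // one page
const PAGES_PER_BLOCK: u64 = 128; // pages in one page block
const MAX_BLOCKS: u64 = 32_768; // maximum number of page blocks

// One page block: 128 pages of 64 KiB each, i.e. 8 MiB.
const BLOCK_SIZE_BYTES: u64 = PAGE_SIZE_BYTES * PAGES_PER_BLOCK;

// All page blocks together: 32768 blocks of 8 MiB each, i.e. 256 GiB.
const MAX_REGION_BYTES: u64 = BLOCK_SIZE_BYTES * MAX_BLOCKS;

/// Number of page blocks needed to back `pages` pages (rounding up).
/// A region with at least one page therefore occupies at least one full block.
fn blocks_needed(pages: u64) -> u64 {
    (pages + PAGES_PER_BLOCK - 1) / PAGES_PER_BLOCK
}
```

For example, `blocks_needed(129)` is 2, so a region of 129 pages occupies two page blocks of physical stable memory.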
79+
80+
81+
## Questions and answers
82+
83+
### Q: What determines the 8MB non-empty region minimum?
84+
85+
Mostly, we want to reduce the amount of metadata we need to track, so instead of per-page metadata (lots) we only need per-block metadata (less).
86+
This size means we grow a region by more than one physical page at
87+
a time (in terms of the way that the canister interacts with the
88+
system API, at least). Rather than actually grow by a single page,
89+
the implementation grows by a "page block" (8MB) at a time.
90+
91+
This choice means that there are 128 pages per page block, and that
92+
the maximum number of regions and region blocks are each relatively
93+
small (32k each). Consequently, they can each be identified with a
94+
2-byte identifier, and we can pre-allocate tables to store certain
95+
relations about them, which is critical.
96+
97+
### Q: Are 32767 regions enough?
98+
99+
A: Permitting more than 32k regions may seem theoretically
100+
interesting, but is not practical given other parameters today that
101+
control the minimal footprint of a region (8MB) and dictate the
102+
maximum size of stable memory for a canister today (32GB). With 32k
103+
regions at 8MB each, well over the maximum stable memory size is used
104+
(256GB compared to 32GB, the limit today).
105+
106+
### Q: When is stable memory becoming less costly?
107+
108+
Spring 2023.
109+
110+
### Q: How does the cheaper stable memory access cost compare with ordinary heap memory access cost?
111+
112+
2-3x slower than ordinary heap memory.
113+
114+
115+
## Design details
116+
117+
### API
118+
119+
Internal region allocator operations:
120+
121+
- `initialize` -- called by the RTS, not by the Motoko developer.
122+
123+
User-facing region allocator operations:
124+
125+
- `region_new` -- create a dynamic region.
126+
- `region_grow` -- grow region by a specified number of pages.
127+
- `region_load` -- read some data from the region.
128+
- `region_store` -- store some data into the region.
129+
130+
### FUTURE WORK
131+
132+
Add a special operation, for testing our design for future GC integration (bonus):
133+
134+
- `region_release` -- release region and reuse region's page blocks.
135+
136+
The `_release` operation is *not* part of the user-facing API nor part of the MVP,
137+
but supporting it is important because it means we can transition quickly to an integration
138+
with the ambient Motoko GC if we can support it.
139+
140+
Another special operation, for disaster recovery:
141+
142+
- `rebuild` -- not needed unless we need to recreate all Region objects from their stable-memory counterparts.
143+
144+
145+
## Internal footprint
146+
147+
The state of the allocator is stored in a combination of:
148+
149+
- stable memory fields and tables and
150+
- stable heap memory, in the form of objects of opaque type `Region`.
151+
152+
The stable memory state is sufficient to reconstitute the stable heap objects
153+
(`rebuild` operation, described in a subsection below).
154+
155+
That means that even if the stable parts of the heap are lost, the
156+
stable memory state fully describes the region objects, so they can be rebuilt from it.
157+
158+
### stable memory fields
159+
160+
- total allocated blocks, `u16`, max value is `32768`.
161+
- total allocated regions, `u64`, max value is 2^64-1 (one region is reserved for "no region" in block-region table).
162+
- The `block` table (fixed size, about 6 pages).
163+
164+
### representation of values of type `Region`
165+
166+
- A singleton, heap-allocated object with mutable fields.
167+
- While being heap-allocated, the object is also `stable` (can be stored in a `stable var`, etc).
168+
- `RegionObject { id_lower: u32; id_upper: u32; mut page_count: u32; mut vec_pages: Value }`
169+
- Fields `id_lower` (lower 32 bits) and `id_upper` (upper 32 bits) give the Region's numerical 64-bit id (`id = (id_upper << 32) | id_lower`).
170+
- Field `page_count` gives the number of pages allocated to the Region.
171+
- Field `vec_pages` points at a heap-allocated `Blob` value, and it works with `page_count`
172+
to represent a growable vector that we call the region's **"access
173+
vector"** (because "blocks vector" sounds a bit strange, and it's
174+
used to support O(1) access operations):
175+
- the access vector has `vec_capacity` slots.
176+
- each slot is a `u16`.
177+
- the first `(page_count + 127) / 128` slots contain a valid page block ID for the region.
178+
- during an upgrade, the access vectors get serialized and deserialized as data `Blobs`.
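
A minimal sketch of how the access vector supports O(1) address translation follows. It models the region object with a plain Rust struct and a `Vec<u16>` rather than the RTS heap layout. It also assumes that block ID `b` lives at `blocks_base + b * block_size`, where `blocks_base` is a hypothetical parameter for the address at which page blocks start. None of these names are taken from the implementation.

```rust
const PAGE_SIZE_BYTES: u64 = 65_536;
const PAGES_PER_BLOCK: u64 = 128;
const BLOCK_SIZE_BYTES: u64 = PAGE_SIZE_BYTES * PAGES_PER_BLOCK;

/// Simplified stand-in for the heap-allocated region object (illustrative only).
struct Region {
    page_count: u32,
    /// Access vector: slot `i` holds the page block ID of the region's i-th block.
    vec_pages: Vec<u16>,
}

impl Region {
    /// Number of access-vector slots that hold a valid block ID:
    /// (page_count + 127) / 128.
    fn used_slots(&self) -> usize {
        ((self.page_count as u64 + PAGES_PER_BLOCK - 1) / PAGES_PER_BLOCK) as usize
    }

    /// Translate a byte offset inside the region into a physical address.
    /// `blocks_base` is an assumed parameter: the address where block ID 0 starts.
    fn physical_address(&self, offset: u64, blocks_base: u64) -> Option<u64> {
        if offset >= self.page_count as u64 * PAGE_SIZE_BYTES {
            return None; // beyond the pages allocated to this region
        }
        let slot = (offset / BLOCK_SIZE_BYTES) as usize; // which block of the region
        let within = offset % BLOCK_SIZE_BYTES; // offset inside that block
        let block_id = *self.vec_pages.get(slot)? as u64;
        Some(blocks_base + block_id * BLOCK_SIZE_BYTES + within)
    }
}
```

The lookup is one division, one remainder and one vector index, which is why no search of the `block-region` table is needed on the load/store path.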
179+
180+
181+
### region-blocks relation
182+
183+
The `region-blocks` relation is not materialized into a table in stable memory (doing so with a pre-allocated table would be prohibitively large).
184+
185+
Instead, this relation is represented in two ways at the same time:
186+
1. by the set of heap-allocated region objects, and their access vectors. The access vectors provide O(1) store and load support.
187+
2. by the `block-region` table, whose entries together are sufficient to recreate all of the heap-allocated region objects.
188+
189+
In ordinary operation, the second feature is not required. In the event of an upgrade failure, however, it could be vital (See `rebuild`).
190+
191+
### block-region table
192+
193+
- purpose:
194+
- relate a block ID ("page block ID") to its region (if any), its position (or rank) in that region (see `rebuild`) and its current size in (used) pages (<=128).
195+
All but the last block owned by a region should have all 128 pages allocated.
196+
197+
- NB: The organization of this table is not useful for efficient
198+
access calculations of load or store (they require a linear
199+
search that would be prohibitively slow). OTOH, this
200+
table is suitable to do a "batch" rebuild of the dynamic heap-allocated vectors
201+
in that table, if need be (see `rebuild`).
202+
203+
- 32768 entries (statically sized).
204+
- 8 (id) + 2 (rank) + 1 (used) = 11 bytes per entry.
205+
- entry type = `BlockRegion { region : u64, position : u16, size: u8 }`
206+
- the location of each entry gives its corresponding block ID.
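
As an illustration of the 11-byte entry layout, here is a sketch of encoding and decoding one `BlockRegion` entry. The field order follows the entry type above. The little-endian byte order, the `table_base` parameter and the function names are assumptions made for this example, not details confirmed by this document.

```rust
const ENTRY_BYTES: usize = 11; // 8 (id) + 2 (rank) + 1 (used)
const MAX_BLOCKS: usize = 32_768;
// 32768 entries of 11 bytes: 360448 bytes, i.e. 5.5 pages ("about 6 pages").
const TABLE_BYTES: usize = MAX_BLOCKS * ENTRY_BYTES;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockRegion {
    region: u64,   // id of the owning region
    position: u16, // rank of this block within the region
    size: u8,      // used pages in this block (<= 128)
}

impl BlockRegion {
    /// Serialize to 11 bytes (little-endian is an assumption of this sketch).
    fn to_bytes(self) -> [u8; ENTRY_BYTES] {
        let mut out = [0u8; ENTRY_BYTES];
        out[0..8].copy_from_slice(&self.region.to_le_bytes());
        out[8..10].copy_from_slice(&self.position.to_le_bytes());
        out[10] = self.size;
        out
    }

    /// Inverse of `to_bytes`.
    fn from_bytes(bytes: [u8; ENTRY_BYTES]) -> Self {
        let mut region = [0u8; 8];
        region.copy_from_slice(&bytes[0..8]);
        let mut position = [0u8; 2];
        position.copy_from_slice(&bytes[8..10]);
        BlockRegion {
            region: u64::from_le_bytes(region),
            position: u16::from_le_bytes(position),
            size: bytes[10],
        }
    }
}

/// The location of an entry gives its block ID: entry `block_id` starts
/// `block_id * 11` bytes after the (hypothetical) `table_base` address.
fn entry_offset(table_base: u64, block_id: u16) -> u64 {
    table_base + block_id as u64 * ENTRY_BYTES as u64
}
```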
207+
208+
209+
### Overview of `rebuild`
210+
211+
When upgrades work as expected, stable `Regions` are serialized and deserialized just like other stable data.
212+
213+
For disaster recovery, we can **also** rebuild all of the region objects from data in stable memory.
214+
215+
We use the `block-region` table in stable memory to rebuild the regions' objects:
216+
217+
- The `block-region` table gives a relative position and region ID for each block ID together with utilized page count.
218+
219+
Once each region's vector has been sized (by a linear scan of block-region, summing block sizes) and allocated, the block-region table says how to fill it, one entry at a time.
220+
Unlike the Rust design, vector entries can be populated out-of-order.
221+
222+
Currently, we need only recover region 0 (when upgrading).
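
The two passes described above can be sketched as follows. This is an illustration under simplifying assumptions, not the RTS code: the table is modelled as an in-memory slice, `None` stands in for the reserved "no region" marker, and all names are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy)]
struct BlockRegion {
    region: u64,   // id of the owning region
    position: u16, // rank of the block within that region
    size: u8,      // used pages in the block
}

struct RebuiltRegion {
    page_count: u32,
    vec_pages: Vec<u16>, // access vector: rank -> block ID
}

/// `table[block_id]` is `None` when the block belongs to no region
/// (a stand-in for the reserved "no region" id).
fn rebuild(table: &[Option<BlockRegion>]) -> HashMap<u64, RebuiltRegion> {
    let mut regions: HashMap<u64, RebuiltRegion> = HashMap::new();

    // Pass 1: linear scan to size each region's vector and sum its pages.
    for entry in table.iter().flatten() {
        let r = regions.entry(entry.region).or_insert(RebuiltRegion {
            page_count: 0,
            vec_pages: Vec::new(),
        });
        r.page_count += entry.size as u32;
        let needed = entry.position as usize + 1;
        if r.vec_pages.len() < needed {
            r.vec_pages.resize(needed, 0);
        }
    }

    // Pass 2: fill the vectors. Entries arrive in block-ID order, so the
    // slots of a region may be populated out of order.
    for (block_id, entry) in table.iter().enumerate() {
        if let Some(e) = entry {
            let r = regions.get_mut(&e.region).expect("sized in pass 1");
            r.vec_pages[e.position as usize] = block_id as u16;
        }
    }
    regions
}
```

In this sketch, a table whose block 0 has rank 1 in region 7 and whose block 2 has rank 0 in the same region yields the access vector `[2, 0]` for region 7.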
223+
224+
225+
### Special (reserved) regions
226+
227+
- Region 0 -- Anonymous region, for supporting the legacy API that we have today, which lacks `Region` values.
228+
- Region 1 -- "Reclaimed blocks" region that consists of blocks reclaimed from GC'd regions.
229+
- Regions 2-15 -- Future use by Motoko RTS (TBD).
230+
231+
### Overview of GC support (future work)
232+
233+
- Regions are represented (see special subsection) with heap objects that are `stable`, but mutable.
234+
- They have special GC headers to recognize their special structure.
235+
- The block-region table (or a more transient bitmap) keeps track of which blocks are in use as Region heap values are GC'd.
236+
237+
Blocks can be marked during normal GC, with unmarked blocks returned to a transient free-list. In this design, blocks are recycled during
238+
the lifetime of a single canister version.
239+
240+
Alternatively, Blocks can be marked only during deserialization after an upgrade, for bespoke, Region-only, GC during upgrades, with unmarked blocks
241+
returned to a free list.
242+
In this design, blocks are only recycled during upgrade from one version to the next, meaning long-lived canisters that create garbage regions will leak
243+
space.
244+
245+
### Migration from earlier designs into region system
246+
247+
#### Version overview
248+
249+
Including this design, there are three possible versions (`0`, `1`, or `2`):
250+
251+
0. Stable vars only.
252+
1. Stable vars *plus* direct access to IC0 API, including `grow`.
253+
This access is exposed via the Experimental Stable Memory API.
254+
2. This Region system, where direct access still works through region zero.
255+
256+
257+
#### Key points
258+
259+
- Version 0:
260+
- will never be forced to "migrate" to other versions (assuming no stable memory API use).
261+
- will never incur the space overhead of the region system.
262+
263+
- Migration from 0 to version 1 occurs when:
264+
- Experimental Stable Memory function `grow` is invoked for the first time.
265+
This will not incur the space overhead of the region system.
266+
267+
- Migration from version 0 or version 1 to version 2 occurs when:
268+
- An initial region is allocated via `Region.new`.
269+
This will incur the space overhead of the region system.
270+
The space overhead is 16 pages (1MiB) when migrating from version 0, and 128 pages (8MiB) when migrating from version 1.
271+
272+
#### Compiler flag
273+
274+
Compiler flag
275+
276+
```
277+
--stable-regions
278+
```
279+
280+
Affects upgrades only and forces migration directly into version 2 from version 0 or 1.
281+
It is provided for testing purposes and *not* required for use of regions.
282+
283+
#### Format version details
284+
285+
The first 32 bits of stable memory record a "marker," which indicates how to determine the "version number"
286+
for Motoko stable memory. This version number is stored either:
287+
- *implicitly*, when the marker is non-zero, and version is `0`.
288+
- *explicitly*, when the marker is zero, and version is stored elsewhere (but currently always `1`).
289+
290+
Including this design, there are three possible versions (`0`, `1`, or `2`). See preceding section.
291+
292+
In the first two cases (`0` and `1`), we wish to *selectively* migrate into the region system (`2`), with its own internal versioning.
293+
294+
295+
#### Opt-in mechanism
296+
297+
The opt-in mechanism for using version 2 consists of either:
298+
299+
- dynamically calling `Region.new()` from a canister currently in version 0 or 1;
300+
- statically specifying compiler flag `--stable-regions`. This is mostly useful for testing.
301+
302+
Critically,
303+
304+
1. The use of physical stable memory is pay-as-you-go: canisters that do not use regions do not pay for that privilege.
305+
2. There is no provision for "downgrading" back to earlier, pre-region systems.
306+
307+
##### Version 0 migration.
308+
309+
To migrate from version 0 to version 2, there is nothing additional to do for existing data.
310+
311+
The region system detects this case by measuring the zero-sized stable memory during its initialization and
312+
starts allocating blocks from address 16*2^16 (1MiB overhead), leaving 10 pages unused for future use.
313+
314+
##### Version 1 migration.
315+
316+
Migrating version 1 stable memory renames it as "region 0" in Version 2.
317+
318+
Critically, to migrate from version 1, we must preserve existing data, but reorganize it slightly.
319+
320+
In effect, all existing data will retain its logical order as (pre-allocated) region 0.
321+
322+
To accommodate the meta data of the region system, we move the first block of region 0, physically.
323+
324+
Then, we reformat the first block of stable memory as the region meta data block.
325+
326+
The rest of the blocks become part of region 0, with its first block stored at the end of physical memory.
327+
328+
The region system starts allocating blocks from address 128*2^16 (8MiB overhead), leaving 122 pages unused for future use.
329+
330+
Since we do not move the remaining blocks of region 0, the first block of memory (excluding meta-data) is unused space.
331+
332+
This design ensures that an existing canister using very large amounts of experimental stable memory can be migrated with only constant-cost movement
333+
of the first block (128 pages) of memory.
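
The resulting layout of region 0 can be sketched as below. This is only an illustration under two assumptions that the text above does not state outright: the version 1 memory spans a whole number of page blocks, and block ID 0 is the block starting at address 128*2^16, so that the relocated first block receives the highest block ID. The function names are hypothetical.

```rust
const PAGE_SIZE_BYTES: u64 = 65_536;
const PAGES_PER_BLOCK: u64 = 128;
const BLOCK_SIZE_BYTES: u64 = PAGE_SIZE_BYTES * PAGES_PER_BLOCK;
// After a version 1 migration, blocks are allocated from address 128*2^16.
const BLOCKS_BASE: u64 = 128 * PAGE_SIZE_BYTES;

/// Physical address of a block, assuming block ID 0 starts at BLOCKS_BASE.
fn block_address(block_id: u16) -> u64 {
    BLOCKS_BASE + block_id as u64 * BLOCK_SIZE_BYTES
}

/// Access vector of region 0 after migrating a version 1 memory that
/// spanned `old_blocks` whole page blocks (an assumption of this sketch).
fn region0_access_vector(old_blocks: u16) -> Vec<u16> {
    if old_blocks == 0 {
        return Vec::new();
    }
    let mut v = Vec::with_capacity(old_blocks as usize);
    // Logical block 0 was moved to the end of physical memory.
    v.push(old_blocks - 1);
    // The remaining blocks were not moved: old physical block i (i >= 1)
    // now carries block ID i - 1.
    v.extend(0..old_blocks - 1);
    v
}
```

Only the first entry of the vector refers to relocated data, which matches the constant-cost movement described above: the other blocks stay where they were and are merely renumbered.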

doc/md/base/ExperimentalStableMemory.md

+6-1
@@ -5,6 +5,11 @@ Byte-level access to (virtual) _stable memory_.
55
and may be replaced by safer alternatives in later versions of Motoko.
66
Use at your own risk and discretion.
77

8+
**DEPRECATION**: Use of `ExperimentalStableMemory` library may be deprecated in future.
9+
Going forward, users should consider using library `Region.mo` to allocate *isolated* regions of memory instead.
10+
Using dedicated regions for different user applications ensures that writing
11+
to one region will not affect the state of another, unrelated region.
12+
813
This is a lightweight abstraction over IC _stable memory_ and supports persisting
914
raw binary data across Motoko upgrades.
1015
Use of this module is fully compatible with Motoko's use of
@@ -30,7 +35,7 @@ NB: The IC's actual stable memory size (`ic0.stable_size`) may exceed the
3035
page size reported by Motoko function `size()`.
3136
This (and the cap on growth) are to accommodate Motoko's stable variables.
3237
Applications that plan to use Motoko stable variables sparingly or not at all can
33-
increase `--max-stable-pages` as desired, approaching the IC maximum (initially 8GiB, then 32Gib, currently 48Gib).
38+
increase `--max-stable-pages` as desired, approaching the IC maximum (initially 8GiB, then 32Gib, currently 64Gib).
3439
All applications should reserve at least one page for stable variable data, even when no stable variables are used.
3540

3641
Usage:
