Conversation
Definition of `Page::bytes_used_by_rows` to follow. This change seemed to stand on its own enough to deserve a separate commit.
We intend to bill based on these predictable metrics, rather than the somewhat-unpredictable actual heap memory usage of the system. As such, we need a way to compute them (duh). This commit adds `Table` methods for computing the number of resident rows, and the number of bytes stored by those rows.
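A minimal sketch of what such `Table` methods might look like, assuming a simplified `Page` that tracks its own row count and row-byte usage (the real types are more involved; the names here mirror the PR's but the bodies are illustrative only):

```rust
// Hypothetical, simplified model: each page knows how many rows it holds
// and how many bytes those rows occupy.
struct Page {
    num_rows: usize,
    bytes_used_by_rows: usize,
}

struct Table {
    pages: Vec<Page>,
}

impl Table {
    /// Total resident rows, summed across all pages.
    fn num_rows(&self) -> u64 {
        self.pages.iter().map(|p| p.num_rows as u64).sum()
    }

    /// Total bytes occupied by rows, summed across all pages.
    /// Space within pages not occupied by rows is deliberately not counted.
    fn bytes_used_by_rows(&self) -> u64 {
        self.pages.iter().map(|p| p.bytes_used_by_rows as u64).sum()
    }
}

fn main() {
    let table = Table {
        pages: vec![
            Page { num_rows: 3, bytes_used_by_rows: 96 },
            Page { num_rows: 2, bytes_used_by_rows: 64 },
        ],
    };
    assert_eq!(table.num_rows(), 5);
    assert_eq!(table.bytes_used_by_rows(), 160);
}
```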
Still a draft, but the overall strategy here makes sense. :)
Per out-of-band discussion, I am not sure this computation will actually be useful to us, but it is the thing I can compute at this time. See comment on `BTreeIndex::num_key_bytes` in btree_index.rs for the specific counting implemented here.
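The actual counting lives behind the `KeySize` trait, which isn't shown here. As an illustration only, one plausible counting — each key's bytes counted once per row that carries that key — can be sketched against a toy index (the type and semantics below are assumptions, not the real `BTreeIndex`):

```rust
use std::collections::BTreeMap;

// Toy stand-in for a B-tree index: serialized key bytes -> row pointers.
// Illustration only; the precise counting is defined by the `KeySize`
// trait in the real code, not by this sketch.
type ToyIndex = BTreeMap<Vec<u8>, Vec<u64>>;

/// One plausible counting: each key's length in bytes, counted once
/// per row that stores that key.
fn num_key_bytes(index: &ToyIndex) -> u64 {
    index
        .iter()
        .map(|(key, rows)| (key.len() * rows.len()) as u64)
        .sum()
}

fn main() {
    let mut index = ToyIndex::new();
    index.insert(b"alice".to_vec(), vec![1, 2]); // 5 bytes * 2 rows
    index.insert(b"bob".to_vec(), vec![3]);      // 3 bytes * 1 row
    assert_eq!(num_key_bytes(&index), 13);
}
```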
jsdt left a comment
This looks good overall. Adding testing is important if we are going to use these for billing.
Is it possible to write a function that computes the size of a row, such that we can assert that `bytes_used_by_rows()` equals the sum of the sizes of all rows? If so, that would give us a good path toward writing tests.
Slow reconstructions of `num_rows` and `bytes_used_by_rows`. Still to follow: index usage reporting.
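The invariant those tests check can be sketched like this — all types and names below are simplified stand-ins, not the real implementation: an incrementally maintained counter must always agree with a slow from-scratch recomputation that sums a per-row size function.

```rust
/// Hypothetical per-row accounting: the bytes a single row occupies.
fn size_of_row(row: &[u8]) -> u64 {
    row.len() as u64
}

/// Simplified stand-in for `Table`, maintaining `num_rows` and
/// `bytes_used_by_rows` incrementally as rows are inserted.
struct Table {
    rows: Vec<Vec<u8>>,
    num_rows: u64,
    bytes_used_by_rows: u64,
}

impl Table {
    fn new() -> Self {
        Table { rows: Vec::new(), num_rows: 0, bytes_used_by_rows: 0 }
    }

    fn insert(&mut self, row: Vec<u8>) {
        self.num_rows += 1;
        self.bytes_used_by_rows += size_of_row(&row);
        self.rows.push(row);
    }

    /// Slow reconstruction, for tests only: recompute from scratch so it
    /// can be compared against the incrementally maintained counter.
    fn recompute_bytes_used_by_rows(&self) -> u64 {
        self.rows.iter().map(|r| size_of_row(r)).sum()
    }
}

fn main() {
    let mut t = Table::new();
    t.insert(vec![0; 16]);
    t.insert(vec![0; 24]);
    assert_eq!(t.num_rows, 2);
    assert_eq!(t.bytes_used_by_rows, t.recompute_bytes_used_by_rows());
}
```

A property test would drive `insert` (and deletion, omitted here for brevity) with arbitrary rows and assert the two quantities agree after every operation.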
@jsdt I have written some unit proptests.
jsdt left a comment
These tests look good, thanks for adding them. For the blob store, do you think recomputing it every time we report is going to be a performance issue? If so, it might be worth keeping track of it as modifications happen.
I would assume that the majority of modules have very small blob stores. If this turns out not to be the case we could easily memoize it in the same way as this PR is doing for other measures.
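If recomputation ever does show up in profiles, the memoization could look roughly like this — a hypothetical, simplified store keyed by content hash (this sketch ignores details of the real `BlobStore`, such as reference counting):

```rust
use std::collections::HashMap;

/// Hypothetical blob store keeping `num_blobs` and `bytes_used_by_blobs`
/// up to date on every modification, so reporting them is O(1).
struct BlobStore {
    blobs: HashMap<[u8; 32], Vec<u8>>, // content hash -> blob bytes
    num_blobs: u64,
    bytes_used_by_blobs: u64,
}

impl BlobStore {
    fn new() -> Self {
        BlobStore { blobs: HashMap::new(), num_blobs: 0, bytes_used_by_blobs: 0 }
    }

    fn insert(&mut self, hash: [u8; 32], blob: Vec<u8>) {
        let len = blob.len() as u64;
        // In a content-addressed store, an already-present hash means the
        // same blob, so only count it on first insertion.
        if self.blobs.insert(hash, blob).is_none() {
            self.num_blobs += 1;
            self.bytes_used_by_blobs += len;
        }
    }

    fn remove(&mut self, hash: &[u8; 32]) {
        if let Some(blob) = self.blobs.remove(hash) {
            self.num_blobs -= 1;
            self.bytes_used_by_blobs -= blob.len() as u64;
        }
    }
}

fn main() {
    let mut store = BlobStore::new();
    store.insert([0u8; 32], vec![0; 100]);
    store.insert([1u8; 32], vec![0; 50]);
    store.remove(&[0u8; 32]);
    assert_eq!(store.num_blobs, 1);
    assert_eq!(store.bytes_used_by_blobs, 50);
}
```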
Description of Changes
This PR adds the ability to compute and report the number of rows in memory, and the number of bytes used by those rows.
Part of https://github.com/clockworklabs/SpacetimeDBPrivate/issues/1229 .
Currently, reporting is accomplished by a new group of metrics, all of which are prefixed with `spacetime_data_size`. Specifically, for each database, we report:

- `blob_store_num_blobs`, the number of blobs in the `BlobStore`.
- `blob_store_bytes_used_by_blobs`, the number of bytes used by large blobs in the `BlobStore`.

For each table in each database, we report:

- `table_num_rows`, the number of rows in the table.
- `table_bytes_used_by_rows`, the number of bytes in `Page`s used by rows in the table. Unused space within `Page`s is also not included here.
- `table_num_rows_in_indexes`, the number of rows in indexes in the table. This will generally be `table_size * num_indices`.
- `table_bytes_used_by_index_keys`, the bytes used to store keys in indexes in the table. See the `KeySize` trait for a precise definition of this metric.

In this PR, the new metrics are reported when committing a mutable TX, as that's when their values change. However, it's not necessary to report them this often; unlike our existing metrics, they are not incremental. (Or rather, the incremental maintenance is confined to the `table` crate, and not visible to the `core` crate where they are read and reported.) It would be reasonable to report them every N transactions for some choice of N, every t seconds for some choice of t, in response to an external request, or in any number of other ways.

API and ABI breaking changes
N/A, unless adding metrics breaks Prometheus in some way I don't understand.
Expected complexity level and risk
3: it would be unfortunate if we misreported these metrics, since we intend to use them for billing, and the computations for some of them are non-trivial. It's also possible (I haven't checked) that computing and reporting these metrics will have meaningful overhead, causing a performance regression. That said, these changes are very unlikely to break any existing functionality.
Testing