Parallel Workers #40

Merged
chee merged 2 commits into bump-sdn-experimental from parallel on Jun 16, 2026

@expede (Member) commented Jun 16, 2026

Stacked PR over #39

File system processing admits a fair bit of isolated parallelism. In this PR, we spin up 8 workers to avoid bottlenecking a single thread on things like Automerge doc materialisation:

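| files | serial ms | shard ms | speedup | serial maxDrift | shard maxDrift | RSS serial | RSS shard |
|------:|----------:|---------:|--------:|----------------:|---------------:|-----------:|----------:|
| 100 | 3855 | 1023 | 3.8× | 2477 ms | 0 ms | 166 MB | 489 MB |
| 500 | 16255 | 5063 | 3.2× | 8817 ms | 0 ms | 378 MB | 687 MB |
| 1000 | 33457 | 9397 | 3.6× | 19039 ms | 84 ms | 580 MB | 951 MB |
| 2000 | 67023 | 17751 | 3.8× | 38997 ms | 193 ms | 935 MB | 1354 MB |
| 4000 | 139165 | 39800 | 3.5× | 80513 ms | 471 ms | 1686 MB | 2266 MB |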
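For anyone re-running the numbers: the speedup column is just serial wall time divided by sharded wall time. A minimal sketch that recomputes it from the rows above (the `speedup` helper and `benchRows` name are illustrative, not part of this PR):

```typescript
// Speedup as reported in the table: serial wall time / sharded wall time.
function speedup(serialMs: number, shardMs: number): number {
  if (shardMs <= 0) throw new RangeError("shardMs must be positive");
  return serialMs / shardMs;
}

// Rows copied from the benchmark table: [files, serial ms, shard ms].
const benchRows: Array<[number, number, number]> = [
  [100, 3855, 1023],
  [500, 16255, 5063],
  [1000, 33457, 9397],
  [2000, 67023, 17751],
  [4000, 139165, 39800],
];

for (const [files, serialMs, shardMs] of benchRows) {
  // Rounded to one decimal place, matching the table's precision.
  console.log(`${files} files: ${speedup(serialMs, shardMs).toFixed(1)}×`);
}
// prints 3.8×, 3.2×, 3.6×, 3.8×, 3.5× for the five rows
```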

Notice that the serial version scales superlinearly: contention on the single thread prevents async IO from running (because of the JS runtime lifecycle).
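One way to read the maxDrift columns, as a rough sketch: assuming "drift" means how late a periodic timer fires when the thread is busy (that definition is my assumption here, and `maxDriftMs` is a hypothetical helper, not the benchmark's actual code), it could be computed from recorded tick timestamps like this:

```typescript
// Assumption: drift = how much longer than `intervalMs` the gap between two
// consecutive timer ticks was. A blocked thread delays ticks, so the gap grows.
function maxDriftMs(tickTimesMs: readonly number[], intervalMs: number): number {
  let max = 0;
  for (let i = 1; i < tickTimesMs.length; i++) {
    const gap = tickTimesMs[i] - tickTimesMs[i - 1];
    max = Math.max(max, gap - intervalMs);
  }
  return max;
}

// Made-up timestamps: a 100 ms timer that stalls once for 2.5 s.
const observedTicks = [0, 100, 200, 2700, 2800];
console.log(maxDriftMs(observedTicks, 100)); // 2400
```

In a real harness the timestamps would presumably come from a `setInterval` callback recording `performance.now()`; when a long synchronous task holds the thread, that callback (like any pending IO callback) cannot run until the task yields.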

This uses a fairly naive round-robin strategy instead of work stealing or a worker-per-task scheme. The number of workers is fixed up front, so that's a separate scaling factor. Here are the results for different worker counts on 1000 × 32 KB docs:

| workers | speedup | maxDrift | peak RSS |
|--------:|--------:|---------:|---------:|
| 1 (AKA serial) | 1.00× | 17684 ms | 578 MB |
| 2 | 1.87× | 98 ms | 770 MB |
| 4 | 2.73× | 117 ms | 849 MB |
| 8 | 3.42× | 57 ms | 965 MB |
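For reference, the round-robin assignment mentioned above boils down to dealing items out to a fixed number of shards by index. A minimal sketch, assuming a flat list of file paths as input (`shardRoundRobin` and the example paths are hypothetical, not the code in this PR):

```typescript
// Hypothetical helper: item i goes to shard (i mod workerCount).
// No work stealing: a shard that finishes early simply sits idle.
function shardRoundRobin<T>(items: readonly T[], workerCount: number): T[][] {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new RangeError("workerCount must be a positive integer");
  }
  const shards: T[][] = Array.from({ length: workerCount }, () => []);
  items.forEach((item, i) => {
    shards[i % workerCount].push(item);
  });
  return shards;
}

const examplePaths = ["a.md", "b.md", "c.md", "d.md", "e.md"];
console.log(shardRoundRobin(examplePaths, 2));
// [["a.md", "c.md", "e.md"], ["b.md", "d.md"]]
```

Each shard would then be handed to one worker. The tradeoff is that round-robin balances item counts, not actual work, so a shard with a few unusually large docs can become the straggler.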

Of course, much larger or much smaller file systems would perform differently. There's an argument for scaling dynamically, but that's probably YAGNI. Also recall that parallelism can introduce its own contention, leading to paradoxical slowdowns, so a conservative choice like 8 is probably good for most situations. At our scale, 8 seems reasonable. At about 4k files, moving higher seems to help, but we don't tend to sync >4k docs at a time AFAIK.

Worker-count check at 4000 × 32 KB: the knee moved

| workers | speedup (1000 files) | speedup (4000 files) |
|--------:|:---------------------|:---------------------|
| 8 | 3.42× | 2.87× |
| 16 | ~3.5× (≈ 8, plateau) | 4.22× ← best |
| 23 | n/a | 3.52× (regresses) |
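If we ever did want to pick the count from measurements rather than hard-coding it, the selection itself is tiny. A sketch under that assumption, using the 4000-file numbers above (`Sample` and `bestWorkerCount` are hypothetical names; nothing like this exists in the PR):

```typescript
interface Sample {
  workers: number;
  speedup: number;
}

// Hypothetical helper: return the measured sample with the highest speedup.
function bestWorkerCount(samples: readonly Sample[]): Sample {
  if (samples.length === 0) throw new Error("no samples");
  return samples.reduce((best, s) => (s.speedup > best.speedup ? s : best));
}

// Speedups copied from the 4000 × 32 KB column.
const at4000: Sample[] = [
  { workers: 8, speedup: 2.87 },
  { workers: 16, speedup: 4.22 },
  { workers: 23, speedup: 3.52 },
];

console.log(bestWorkerCount(at4000).workers); // 16
```

This only finds the best of the points actually measured; it says nothing about counts in between, and the knee clearly depends on the number of files.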

@expede expede marked this pull request as draft June 16, 2026 03:50
@expede expede changed the base branch from main to bump-sdn-experimental June 16, 2026 03:50
@expede expede changed the title from "Parallel" to "Parallel Workers" Jun 16, 2026
@expede expede marked this pull request as ready for review June 16, 2026 04:48
@expede expede self-assigned this Jun 16, 2026
@chee chee merged commit 278310b into bump-sdn-experimental Jun 16, 2026
2 checks passed