encoding benchmarks #79
Open
Choosing the Right Encoding for Networking
1. Introduction
Nomos currently uses bincode for all serialization. Bincode is fast and simple, but it lacks support for schema evolution and cross-language compatibility, two features that become increasingly important as the ecosystem scales.
2. Selection Criteria
We evaluated candidate encoding formats based on:
3. Format Comparison (P2P-Relevant Formats)
4. Tooling Considerations
Serde-Generate
https://crates.io/crates/serde-generate
Auto-generates serializers for Rust types, but fails with complex constructs like `Risc0LeaderProof` due to issues with the `tracing` feature. Adding schemas manually is also non-trivial.
Canonical Protobuf in Rust
https://crates.io/crates/prost
We can use `prost` with deterministic configurations:
Manual canonicalization wrapper:
5. Why Evolution Matters
Rigid encodings like bincode break when structures evolve. Nomos may support thousands of clients, making coordinated upgrades costly. Formats that support optional fields and schema evolution reduce this friction.
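To make the evolution problem concrete, here is a minimal std-only sketch. It does not use bincode or any of the evaluated crates: the positional functions are a hand-rolled stand-in for a tag-free layout, and the tagged functions are a toy tag-length-value scheme. `HeaderV1`, `HeaderV2`, the tag constants, and all function names are hypothetical and exist only for this illustration.

```rust
use std::collections::BTreeMap;

// Hypothetical message: v2 adds an optional `memo` field to v1.
#[derive(Debug, PartialEq)]
struct HeaderV1 { id: u32, height: u64 }

#[derive(Debug, PartialEq)]
struct HeaderV2 { id: u32, height: u64, memo: Option<Vec<u8>> }

fn read_u32(b: &[u8]) -> Option<u32> {
    let mut a = [0u8; 4];
    a.copy_from_slice(b.get(..4)?);
    Some(u32::from_le_bytes(a))
}

fn read_u64(b: &[u8]) -> Option<u64> {
    let mut a = [0u8; 8];
    a.copy_from_slice(b.get(..8)?);
    Some(u64::from_le_bytes(a))
}

// Positional layout (assumed stand-in): fields back to back, no tags.
fn positional_encode_v1(h: &HeaderV1) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&h.id.to_le_bytes());
    out.extend_from_slice(&h.height.to_le_bytes());
    out
}

// A v2 positional reader expects id, height, memo length, memo bytes.
fn positional_decode_v2(buf: &[u8]) -> Option<HeaderV2> {
    let id = read_u32(buf)?;
    let height = read_u64(buf.get(4..)?)?;
    let memo_len = read_u32(buf.get(12..)?)? as usize;
    let end = 16usize.checked_add(memo_len)?;
    let memo = buf.get(16..end)?.to_vec();
    Some(HeaderV2 { id, height, memo: Some(memo) })
}

// Tagged layout: each field is (tag: u8, len: u32 LE, value bytes).
const TAG_ID: u8 = 1;
const TAG_HEIGHT: u8 = 2;
const TAG_MEMO: u8 = 3;

fn put_field(out: &mut Vec<u8>, tag: u8, value: &[u8]) {
    out.push(tag);
    out.extend_from_slice(&(value.len() as u32).to_le_bytes());
    out.extend_from_slice(value);
}

fn parse_fields(mut buf: &[u8]) -> Option<BTreeMap<u8, Vec<u8>>> {
    let mut fields = BTreeMap::new();
    while !buf.is_empty() {
        let tag = *buf.first()?;
        let len = read_u32(buf.get(1..)?)? as usize;
        let end = 5usize.checked_add(len)?;
        fields.insert(tag, buf.get(5..end)?.to_vec());
        buf = &buf[end..];
    }
    Some(fields)
}

fn tagged_encode_v1(h: &HeaderV1) -> Vec<u8> {
    let mut out = Vec::new();
    put_field(&mut out, TAG_ID, &h.id.to_le_bytes());
    put_field(&mut out, TAG_HEIGHT, &h.height.to_le_bytes());
    out
}

fn tagged_encode_v2(h: &HeaderV2) -> Vec<u8> {
    let mut out = Vec::new();
    put_field(&mut out, TAG_ID, &h.id.to_le_bytes());
    put_field(&mut out, TAG_HEIGHT, &h.height.to_le_bytes());
    if let Some(memo) = &h.memo {
        put_field(&mut out, TAG_MEMO, memo);
    }
    out
}

// The v1 reader picks the tags it knows and ignores the rest.
fn tagged_decode_v1(buf: &[u8]) -> Option<HeaderV1> {
    let f = parse_fields(buf)?;
    Some(HeaderV1 { id: read_u32(f.get(&TAG_ID)?)?, height: read_u64(f.get(&TAG_HEIGHT)?)? })
}

// The v2 reader treats a missing memo tag as `None`.
fn tagged_decode_v2(buf: &[u8]) -> Option<HeaderV2> {
    let f = parse_fields(buf)?;
    Some(HeaderV2 {
        id: read_u32(f.get(&TAG_ID)?)?,
        height: read_u64(f.get(&TAG_HEIGHT)?)?,
        memo: f.get(&TAG_MEMO).cloned(),
    })
}

fn main() {
    let old = HeaderV1 { id: 7, height: 42 };
    let new = HeaderV2 { id: 7, height: 42, memo: Some(b"hi".to_vec()) };

    // Positional: an upgraded reader cannot parse a payload from an old writer.
    assert_eq!(positional_decode_v2(&positional_encode_v1(&old)), None);

    // Tagged: old and new nodes can read each other's payloads.
    let upgraded = tagged_decode_v2(&tagged_encode_v1(&old));
    assert_eq!(upgraded, Some(HeaderV2 { id: 7, height: 42, memo: None }));
    assert_eq!(tagged_decode_v1(&tagged_encode_v2(&new)), Some(old));
}
```

The tagged variant pays five extra bytes per field in this toy layout. Real formats with field tags use more compact headers, but the trade-off between wire size and upgrade flexibility is the same in kind.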
6. Planning for Multi-Language Support
Support for multiple implementations matters most post-adoption. Initially, most ecosystems rely on one client. However, protocol-level encoding should avoid locking into a single ecosystem from the start.
7. Serialization Benchmark Analysis
Benchmark Overview
This analysis compares 8 serialization formats: Bincode, Borsh, CBOR, JSON, MessagePack, Protobuf, SCALE, and SSZ. The benchmarks test performance on different data types and scales.
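As a rough illustration of what a roundtrip measurement across payload sizes looks like, the sketch below times encode-then-decode with only the standard library. It is not the harness used in this PR, which is not shown here. `Sample`, `encode`, `decode`, and `time_roundtrip` are hypothetical names, and the hand-written codec stands in for whichever format is under test.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical payload type; the benchmark's real types are not shown here.
#[derive(Clone, Debug, PartialEq)]
struct Sample { id: u64, payload: Vec<u8> }

// Placeholder codec: id (u64 LE), payload length (u32 LE), payload bytes.
fn encode(s: &Sample) -> Vec<u8> {
    let mut out = Vec::with_capacity(12 + s.payload.len());
    out.extend_from_slice(&s.id.to_le_bytes());
    out.extend_from_slice(&(s.payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&s.payload);
    out
}

fn decode(buf: &[u8]) -> Option<Sample> {
    let mut id = [0u8; 8];
    id.copy_from_slice(buf.get(..8)?);
    let mut len = [0u8; 4];
    len.copy_from_slice(buf.get(8..12)?);
    let end = 12usize.checked_add(u32::from_le_bytes(len) as usize)?;
    let payload = buf.get(12..end)?.to_vec();
    Some(Sample { id: u64::from_le_bytes(id), payload })
}

// Average wall-clock time per encode + decode; `iters` must be non-zero.
fn time_roundtrip(sample: &Sample, iters: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the work.
        let bytes = encode(black_box(sample));
        let back = decode(black_box(&bytes)).expect("roundtrip must succeed");
        black_box(back);
    }
    start.elapsed() / iters
}

fn main() {
    for &size in &[16usize, 1024, 65_536] {
        let sample = Sample { id: 1, payload: vec![0xAB; size] };
        let per_iter = time_roundtrip(&sample, 1_000);
        println!("payload {:>6} B: {:?} per roundtrip", size, per_iter);
    }
}
```

A loop like this gives only coarse numbers. A statistical harness with warm-up and outlier handling is the better basis for comparing formats.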
Top Observations
Simple Structs Serialization (Roundtrip Performance)
Binary Structs Performance
Large Structs Performance
Performance Scaling Characteristics
8. Encoding Evaluation and Conclusion
Benchmark results show meaningful performance differences across formats, especially in large and nested structures. While bincode leads in raw speed for small data, SCALE and Borsh outperform others in large-structure throughput.
Good alternatives to consider:
Each has trade-offs. The decision depends on the need for schema evolution, performance, cross-language support, and implementation simplicity.
9. How to run benchmarks
To run the benchmarks, clone the repository and execute: