29 changes: 29 additions & 0 deletions wire_encodings/Cargo.toml
@@ -0,0 +1,29 @@
[package]
name = "wire_encodings"
version = "0.1.0"
edition = "2024"

[dependencies]
serde = { version = "1.0", features = ["derive"] }
bincode = { version = "1.3" }
prost = "0.14.1"
rmp-serde = "1.1"
serde_cbor = "0.11"
parity-scale-codec = { version = "3.6", features = ["derive"] }
borsh = { version = "1.0", features = ["derive"] }
serde_json = "1.0"


ethereum_ssz = "0.9.0"
ethereum_ssz_derive = "0.9.0"

hex = "0.4"

[dev-dependencies]
criterion = { version = "0.6.0", features = ["html_reports"] }


[[bench]]
name = "encoding_benchmarks"
path = "benches/benchmark.rs"
harness = false
162 changes: 162 additions & 0 deletions wire_encodings/README.md
@@ -0,0 +1,162 @@
# Choosing the Right Encoding for Networking

## 1. Introduction

Nomos currently uses **bincode** for all serialization. While fast and simple, bincode lacks support for **schema evolution** and **cross-language compatibility**—two features increasingly important as the ecosystem scales.

## 2. Selection Criteria

We evaluated candidate encoding formats based on:

* **Schema Evolution** – Supports changes to data structures without breaking compatibility.
* **Security** – Handles untrusted inputs safely.
* **Performance** – Fast to serialize and deserialize.
* **Determinism** – Always produces identical output for the same input.
* **Cross-language Compatibility** – Available and maintained across multiple languages.
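
The determinism criterion is worth spelling out: formats that serialize maps in iteration order inherit whatever order the map provides. A minimal std-only Rust sketch (the length-prefixed `encode_entries` helper is hypothetical, for illustration only) shows why canonical encoders pin map ordering, e.g. to sorted keys via `BTreeMap`:

```rust
use std::collections::BTreeMap;

// Hypothetical length-prefixed encoder: each string is written as a
// little-endian u32 length followed by its raw bytes.
fn encode_entries<'a, I>(entries: I) -> Vec<u8>
where
    I: Iterator<Item = (&'a String, &'a String)>,
{
    let mut out = Vec::new();
    for (k, v) in entries {
        for s in [k, v] {
            out.extend_from_slice(&(s.len() as u32).to_le_bytes());
            out.extend_from_slice(s.as_bytes());
        }
    }
    out
}

fn main() {
    // Same entries, different insertion order.
    let mut a = BTreeMap::new();
    a.insert("x".to_string(), "1".to_string());
    a.insert("y".to_string(), "2".to_string());

    let mut b = BTreeMap::new();
    b.insert("y".to_string(), "2".to_string());
    b.insert("x".to_string(), "1".to_string());

    // BTreeMap iterates in sorted key order, so the bytes are identical.
    // With a HashMap, iteration order is unspecified and the same
    // comparison would give no guarantee.
    assert_eq!(encode_entries(a.iter()), encode_entries(b.iter()));
}
```

This is the same property the `prost` `btree_map` configuration shown below provides for generated map fields.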

## 3. Format Comparison (P2P-Relevant Formats)

| Encoding Format | Evolution | Deterministic | Security | Lang Support | Usage |
| --------------- | --------- | ------------- | -------- | ------------ | --------------------- |
| **Protobuf** | ✓ | △ | ⬤⬤◯ | ⬤⬤⬤ | Cosmos SDK, libp2p |
| **Borsh** | — | ✓ | ⬤⬤⬤ | ⬤⬤◯ | NEAR, Solana programs |
| **SCALE** | — | ✓ | ⬤⬤◯ | ⬤◯◯ | Polkadot |
| **Bincode** | — | ✓ | ⬤⬤◯ | ⬤◯◯ | Solana validators |
| **CBOR** | ✓ | △ | ⬤⬤◯ | ⬤⬤⬤ | Cardano, IPFS |
| **MsgPack** | △ | △ | ⬤◯◯ | ⬤⬤⬤ | Algorand |
| **SSZ** | — | ✓ | ⬤⬤⬤ | ⬤◯◯ | Ethereum 2.0 |

## 4. Tooling Considerations

### Serde-Generate

[https://crates.io/crates/serde-generate](https://crates.io/crates/serde-generate)

Auto-generates serializers for Rust types, but it fails on complex constructs such as `Risc0LeaderProof` due to issues with the `tracing` feature. Adding schemas manually is also non-trivial.

### Canonical Protobuf in Rust

[https://crates.io/crates/prost](https://crates.io/crates/prost)

We can use `prost` with deterministic configurations:

```rust
// In build.rs - use BTreeMap for consistent key ordering
let mut config = prost_build::Config::new();
config.btree_map(&["."]); // Apply to all map fields
config.compile_protos(&["your.proto"], &["."])?;
```

Manual canonicalization wrapper:

```rust
use prost::Message;

fn encode_with_prost<T: Message>(msg: &T) -> Vec<u8> {
// Standard prost encoding. Output is deterministic if:
// - map fields use BTreeMap via build config
// - .proto field tag numbers match declaration order
// - unknown or unset optional fields are avoided
let mut buf = Vec::new();
msg.encode(&mut buf).unwrap();
buf
}
```
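
Whatever format is chosen, one cheap guard is to encode the same value more than once and require byte-identical output. A format-agnostic, std-only sketch (the closure stands in for a real encoder and is illustrative only):

```rust
// Format-agnostic determinism spot check: encode the same value twice,
// plus a clone of it, and require byte-identical output.
fn is_deterministic<T: Clone, F: Fn(&T) -> Vec<u8>>(encode: F, value: &T) -> bool {
    let a = encode(value);
    let b = encode(value);
    let c = encode(&value.clone());
    a == b && b == c
}

fn main() {
    // Toy encoder standing in for a real format (hypothetical).
    let encode = |v: &u64| v.to_le_bytes().to_vec();
    assert!(is_deterministic(encode, &42u64));
}
```

Note that a passing check does not prove determinism (a map-backed format can pass by chance), but a failing one is conclusive.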

## 5. Why Evolution Matters

```rust
// V1
struct Block { height: u64, hash: [u8; 32] }

// V2
struct Block { height: u64, hash: [u8; 32], timestamp: u64 }
```

Rigid encodings like bincode break when structures evolve. Nomos may support thousands of clients, making coordinated upgrades costly. Formats that support optional fields and schema evolution reduce this friction.
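
To make the failure mode concrete, here is a std-only sketch that mimics bincode's positional, fixed-layout encoding by hand (it does not call the crate): a V1 message is exactly 40 bytes, and a V2 decoder expecting 48 bytes cannot distinguish an old message from a corrupt one.

```rust
// V1 encodes to exactly 40 bytes: 8 (height, little-endian) + 32 (hash).
struct BlockV1 {
    height: u64,
    hash: [u8; 32],
}

fn encode_v1(b: &BlockV1) -> Vec<u8> {
    let mut out = b.height.to_le_bytes().to_vec();
    out.extend_from_slice(&b.hash);
    out
}

// A V2 decoder written the same way expects 48 bytes (8 + 32 + 8 for
// timestamp). There is no tag or field marker to fall back on.
fn decode_v2(bytes: &[u8]) -> Result<(u64, [u8; 32], u64), &'static str> {
    if bytes.len() != 48 {
        return Err("length mismatch: cannot decode V1 bytes as V2");
    }
    let height = u64::from_le_bytes(bytes[0..8].try_into().unwrap());
    let hash: [u8; 32] = bytes[8..40].try_into().unwrap();
    let timestamp = u64::from_le_bytes(bytes[40..48].try_into().unwrap());
    Ok((height, hash, timestamp))
}

fn main() {
    let old = encode_v1(&BlockV1 { height: 7, hash: [0u8; 32] });
    assert_eq!(old.len(), 40);
    // V1 bytes are unreadable to a V2 peer; every node must upgrade at once.
    assert!(decode_v2(&old).is_err());
}
```

Tagged formats like Protobuf avoid this by identifying each field with a number, so a V2 decoder simply reads `timestamp` as absent from V1 bytes.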

## 6. Planning for Multi-Language Support

Support for multiple implementations matters most post-adoption. Initially, most ecosystems rely on one client. However, protocol-level encoding should avoid locking into a single ecosystem from the start.

## 7. Serialization Benchmark Analysis

### Benchmark Overview

This analysis compares 8 serialization formats: Bincode, Borsh, CBOR, JSON, MessagePack, Protobuf, SCALE, and SSZ. The benchmarks measure roundtrip (encode + decode) performance per batch: 50 items for simple and binary structs, 5 for large structs. Throughput is therefore batch size divided by roundtrip time; for example, 50 / 607.66 ns ≈ 82.28 Melem/s for bincode on simple structs.

### Top Observations

* Bincode offers the best performance for small and medium-sized data
* SCALE and Borsh perform best on large structures
* Protobuf produces compact encodings with moderate throughput

### Simple Structs Serialization (Roundtrip Performance)

| Encoding Format | Roundtrip Time | Size (bytes) | Throughput (Melem/s) |
| --------------- | -------------- | ------------ | -------------------- |
| Bincode | 607.66 ns | 12 | 82.28 |
| Borsh | 775.14 ns | 12 | 64.51 |
| SSZ | 947.31 ns | 12 | 52.78 |
| SCALE | 1.06 µs | 10 | 47.31 |
| Protobuf | 1.21 µs | 9 | 41.24 |
| MessagePack | 2.12 µs | 13 | 23.63 |
| JSON | 3.74 µs | 31 | 13.36 |
| CBOR | 5.70 µs | 22 | 8.77 |

### Binary Structs Performance

| Encoding Format | Roundtrip Time | Size (bytes) | Throughput (Melem/s) |
| --------------- | -------------- | ------------ | -------------------- |
| Bincode | 1.59 µs | 13 | 31.55 |
| Borsh | 1.81 µs | 9 | 27.64 |
| SCALE | 1.83 µs | 6 | 27.25 |
| Protobuf | 2.66 µs | 7 | 18.81 |
| MessagePack | 3.19 µs | 7 | 15.66 |
| SSZ | 4.27 µs | 9 | 11.70 |
| CBOR | 4.82 µs | 12 | 10.37 |
| JSON | 4.99 µs | 20 | 10.03 |

### Large Structs Performance

| Encoding Format | Roundtrip Time | Size (bytes) | Throughput (Kelem/s) |
| --------------- | -------------- | ------------ | -------------------- |
| SCALE | 2.85 µs | 647 | 1,755 |
| Borsh | 3.05 µs | 700 | 1,637 |
| Protobuf | 4.81 µs | 643 | 1,041 |
| Bincode | 5.29 µs | 772 | 946 |
| SSZ | 8.57 µs | 589 | 583 |
| CBOR | 18.13 µs | 720 | 276 |
| MessagePack | 18.45 µs | 618 | 271 |
| JSON | 27.11 µs | 1,318 | 184 |

### Performance Scaling Characteristics

| Format | Simple Structs | Binary Structs | Large Structs | Overall Scalability |
| ----------- | -------------- | -------------- | ------------- | ------------------- |
| Bincode | ⬤⬤⬤ | ⬤⬤⬤ | ⬤⬤◯ | Strong |
| SCALE | ⬤⬤◯ | ⬤⬤⬤ | ⬤⬤⬤ | Strong |
| Borsh | ⬤⬤◯ | ⬤⬤◯ | ⬤⬤⬤ | Strong |
| Protobuf | ⬤⬤◯ | ⬤⬤◯ | ⬤⬤◯ | Good |
| SSZ | ⬤◯◯ | ⬤◯◯ | ⬤◯◯ | Good |
| MessagePack | ⬤◯◯ | ⬤◯◯ | ⬤◯◯ | Good |
| CBOR | ⬤◯◯ | ⬤⬤◯ | ⬤◯◯ | Moderate |
| JSON | ⬤◯◯ | ⬤◯◯ | ⬤◯◯ | Moderate |

## 8. Encoding Evaluation and Conclusion

Benchmark results show meaningful performance differences across formats, especially in large and nested structures. While bincode leads in raw speed for small data, SCALE and Borsh outperform others in large-structure throughput.

**Good alternatives to consider:**

* **Protobuf** – Supports schema evolution, has broad language support, and offers compact output with moderate performance.
* **Borsh and SCALE** – Compact and deterministic with good performance, but they lack schema evolution and have narrower tooling.

Each has trade-offs. The decision depends on the need for schema evolution, performance, cross-language support, and implementation simplicity.

## 9. How to Run Benchmarks

To run the benchmarks, clone the repository and execute:

```bash
cargo bench
```
75 changes: 75 additions & 0 deletions wire_encodings/benches/benchmark.rs
@@ -0,0 +1,75 @@
use criterion::{Criterion, criterion_group, criterion_main};
use std::time::Duration;

mod common;
mod formats;

use common::*;
use formats::*;

fn benchmark_simple_structs(c: &mut Criterion) {
let data: Vec<SimpleStruct> = (0..50).map(|_| generate_simple_struct()).collect();
let borsh_data: Vec<_> = data.iter().map(convert_to_borsh_simple).collect();
let scale_data: Vec<_> = data.iter().map(convert_to_scale_simple).collect();
let ssz_data: Vec<_> = data.iter().map(convert_to_ssz_simple).collect();
let proto_data: Vec<_> = data.iter().map(convert_to_proto_simple).collect();

println!("\n=== SIMPLE STRUCTS COMPARISON ===");
bench_roundtrip::<SimpleStruct, BincodeFormat>(c, &data, "simple");
bench_roundtrip::<SimpleStruct, JsonFormat>(c, &data, "simple");
bench_roundtrip::<SimpleStruct, CborFormat>(c, &data, "simple");
bench_roundtrip::<SimpleStruct, MessagePackFormat>(c, &data, "simple");
bench_roundtrip::<borsh_format::SimpleStructBorsh, BorshFormat>(c, &borsh_data, "simple");
bench_roundtrip::<scale_format::SimpleStructScale, ScaleFormat>(c, &scale_data, "simple");
bench_roundtrip::<ssz_format::SimpleStructSsz, SszFormat>(c, &ssz_data, "simple");
bench_roundtrip::<protobuf_format::SimpleStructProto, ProtobufFormat>(c, &proto_data, "simple");
}

fn benchmark_binary_structs(c: &mut Criterion) {
let data: Vec<BinaryStruct> = (0..50).map(|_| generate_binary_struct()).collect();
let borsh_data: Vec<_> = data.iter().map(convert_to_borsh_binary).collect();
let scale_data: Vec<_> = data.iter().map(convert_to_scale_binary).collect();
let ssz_data: Vec<_> = data.iter().map(convert_to_ssz_binary).collect();
let proto_data: Vec<_> = data.iter().map(convert_to_proto_binary).collect();

println!("\n=== BINARY STRUCTS COMPARISON ===");
bench_roundtrip::<BinaryStruct, BincodeFormat>(c, &data, "binary");
bench_roundtrip::<BinaryStruct, JsonFormat>(c, &data, "binary");
bench_roundtrip::<BinaryStruct, CborFormat>(c, &data, "binary");
bench_roundtrip::<BinaryStruct, MessagePackFormat>(c, &data, "binary");
bench_roundtrip::<borsh_format::BinaryStructBorsh, BorshFormat>(c, &borsh_data, "binary");
bench_roundtrip::<scale_format::BinaryStructScale, ScaleFormat>(c, &scale_data, "binary");
bench_roundtrip::<ssz_format::BinaryStructSsz, SszFormat>(c, &ssz_data, "binary");
bench_roundtrip::<protobuf_format::BinaryStructProto, ProtobufFormat>(c, &proto_data, "binary");
}

fn benchmark_large_structs(c: &mut Criterion) {
let data: Vec<LargeStruct> = (0..5).map(|_| generate_large_struct()).collect();
let borsh_data: Vec<_> = data.iter().map(convert_to_borsh_large).collect();
let scale_data: Vec<_> = data.iter().map(convert_to_scale_large).collect();
let ssz_data: Vec<_> = data.iter().map(convert_to_ssz_large).collect();
let proto_data: Vec<_> = data.iter().map(convert_to_proto_large).collect();

println!("\n=== LARGE STRUCTS COMPARISON ===");
bench_roundtrip::<LargeStruct, BincodeFormat>(c, &data, "large");
bench_roundtrip::<LargeStruct, JsonFormat>(c, &data, "large");
bench_roundtrip::<LargeStruct, CborFormat>(c, &data, "large");
bench_roundtrip::<LargeStruct, MessagePackFormat>(c, &data, "large");
bench_roundtrip::<borsh_format::LargeStructBorsh, BorshFormat>(c, &borsh_data, "large");
bench_roundtrip::<scale_format::LargeStructScale, ScaleFormat>(c, &scale_data, "large");
bench_roundtrip::<ssz_format::LargeStructSsz, SszFormat>(c, &ssz_data, "large");
bench_roundtrip::<protobuf_format::LargeStructProto, ProtobufFormat>(c, &proto_data, "large");
}

criterion_group!(
name = combined_benches;
config = Criterion::default()
.measurement_time(Duration::from_secs(10))
.sample_size(100);
targets =
benchmark_simple_structs,
benchmark_binary_structs,
benchmark_large_structs
);

criterion_main!(combined_benches);
123 changes: 123 additions & 0 deletions wire_encodings/benches/common/mod.rs
@@ -0,0 +1,123 @@
use criterion::{Criterion, Throughput};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::hint::black_box;

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct SimpleStruct {
pub id: u32,
pub value: u64,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct BinaryStruct {
pub data: Vec<u8>,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct LargeStruct {
pub items: Vec<ItemStruct>,
pub map: HashMap<String, ItemStruct>,
pub nested: SimpleData,
pub blob: Vec<u8>,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct ItemStruct {
pub name: String,
pub values: Vec<u32>,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct SimpleData {
pub values: HashMap<String, String>,
pub inner: InnerData,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct InnerData {
pub count: u32,
pub flag: bool,
}

pub fn generate_simple_struct() -> SimpleStruct {
SimpleStruct {
id: 12345,
value: 9876543210,
}
}

pub fn generate_binary_struct() -> BinaryStruct {
BinaryStruct {
data: vec![1, 2, 3, 4, 5],
}
}

pub fn generate_large_struct() -> LargeStruct {
let item1 = ItemStruct {
name: "item_one".to_string(),
values: vec![10, 20, 30],
};

let item2 = ItemStruct {
name: "item_two".to_string(),
values: vec![40, 50, 60],
};

let mut map = HashMap::new();
map.insert("first".to_string(), item1.clone());
map.insert("second".to_string(), item2.clone());

LargeStruct {
items: vec![item1, item2],
map,
nested: SimpleData {
values: [
("key1".to_string(), "value1".to_string()),
("key2".to_string(), "value2".to_string()),
]
.into_iter()
.collect(),
inner: InnerData {
count: 5000,
flag: true,
},
},
blob: vec![0u8; 512],
}
}

pub trait EncodingBenchmark<T> {
fn name() -> &'static str;
fn encode(data: &T) -> Vec<u8>;
fn decode(data: &[u8]) -> T;
}

pub fn bench_roundtrip<T, F>(c: &mut Criterion, data: &[T], test_name: &str)
where
F: EncodingBenchmark<T>,
T: Clone,
{
let mut group = c.benchmark_group(format!("roundtrip_{}", test_name));
group.throughput(Throughput::Elements(data.len() as u64));

group.bench_function(F::name(), |b| {
b.iter(|| {
for item in data {
let encoded = F::encode(black_box(item));
let decoded = F::decode(black_box(&encoded));
black_box(decoded);
}
})
});

group.finish();

let sample_encoded = F::encode(&data[0]);
println!(
"{} {} - {} bytes per item",
F::name(),
test_name,
sample_encoded.len()
);
}