66 changes: 52 additions & 14 deletions CLAUDE.md
@@ -16,28 +16,34 @@ The pipeline: **WebAssembly → Rust source → Safe binary**

## Repository structure

The project is a Rust workspace with three crates:
The project is a Rust workspace with four crates:

| Crate | Purpose | `no_std` |
|-------|---------|----------|
| `crates/herkos/` | CLI transpiler: parses `.wasm` binaries, emits Rust source code | No (`std`) |
| `crates/herkos-core/` | Core transpiler library: parses `.wasm`, builds IR, optimizes, emits Rust | No (`std`) |
| `crates/herkos/` | CLI wrapper around `herkos-core` | No (`std`) |
| `crates/herkos-runtime/` | Runtime library shipped with transpiled output | **Yes** |
| `crates/herkos-tests/` | Integration tests + benchmarks: WAT/C/Rust → .wasm → transpile → test | No (`std`) |

### Transpiler pipeline (`crates/herkos/src/`)
### Transpiler pipeline (`crates/herkos-core/src/`)

```
.wasm → parser/      → ir/builder/ → optimizer/    → backend/safe.rs → codegen/      → rustfmt
        (wasmparser)   (SSA IR)      (dead blocks)   (SafeBackend)     (Rust source)
.wasm → parser/      → ir/builder/ → optimize_ir()  → lower_phis()   → optimize_lowered_ir() → codegen/      → rustfmt
        (wasmparser)   (SSA IR)      (pre-lowering)   (SSA destruct)   (post-lowering)         (Rust source)
```

Key modules:
- `parser/` — Wasm binary parsing via `wasmparser` crate
- `ir/` — SSA-form intermediate representation (`ModuleInfo`, `IrFunction`, `IrBlock`, `IrInstr`)
- `ir/` — SSA-form intermediate representation
- `ir/types.rs` — `ModuleInfo`, `IrFunction`, `IrBlock`, `IrInstr`, `VarId`, `DefVar`, `UseVar`, `BlockId`, etc.
- `ir/builder/` — Wasm → IR translation (core.rs, translate.rs, assembly.rs, analysis.rs)
- `optimizer/` — IR optimization passes (currently: dead block elimination)
- `ir/lower_phis.rs` — SSA destruction: phi nodes → predecessor `Assign` instructions
- `optimizer/` — IR optimization passes, split into two phases:
- **Pre-lowering** (on SSA IR with phi nodes): `dead_blocks`, `const_prop`, `algebraic`, `copy_prop`
- **Post-lowering** (on phi-free IR): `empty_blocks`, `dead_blocks`, `merge_blocks`, `copy_prop`, `local_cse`, `gvn`, `dead_instrs`, `branch_fold`, `licm`
- `backend/` — Backend trait + `SafeBackend` (bounds-checked, no unsafe)
- `codegen/` — IR → Rust source (module.rs, function.rs, instruction.rs, traits.rs, export.rs, constructor.rs)
- `codegen/` — IR → Rust source (module.rs, function.rs, instruction.rs, traits.rs, export.rs, constructor.rs, env.rs, types.rs, utils.rs)
- `c_ffi.rs` — C-compatible FFI wrapper around `transpile()`

### Runtime (`crates/herkos-runtime/src/`)

@@ -58,24 +64,48 @@ Key modules:

```bash
cargo build # build all crates
cargo test # run all tests
cargo clippy --all-targets # lint (CI enforced)
cargo fmt --check # format check (CI enforced)
cargo bench -p herkos-tests # benchmarks
```

Run a single crate's tests:
```bash
cargo test -p herkos
cargo test -p herkos-core # transpiler unit tests (IR, optimizer, codegen)
cargo test -p herkos-runtime
cargo test -p herkos-tests
```

**`herkos-tests` must always be run twice** — once with optimizations off, once on — to verify that the optimizer does not change observable behavior:

```bash
HERKOS_OPTIMIZE=0 cargo test -p herkos-tests # unoptimized output
HERKOS_OPTIMIZE=1 cargo test -p herkos-tests # optimized output
```

`HERKOS_OPTIMIZE` is consumed by `herkos-tests/build.rs` at compile time to control whether the transpiled test modules are generated with `-O` or not. It has no effect on `herkos-core`, `herkos-runtime`, or production code. CI enforces both runs. For all other crates, `cargo test` without this variable is sufficient.
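
As a rough illustration of how a build script could consume this variable, here is a minimal sketch. The helper names `optimize_enabled` and `transpile_args` are hypothetical, and the rule that only the exact value `1` enables optimization is an assumption; the actual `herkos-tests/build.rs` may parse the variable differently.

```rust
/// Hypothetical helper: interpret the raw value of `HERKOS_OPTIMIZE`.
/// Assumption: only the exact string "1" turns optimization on;
/// an unset variable or any other value leaves it off.
fn optimize_enabled(raw: Option<&str>) -> bool {
    matches!(raw, Some("1"))
}

/// Hypothetical helper: build the CLI arguments for one transpile run,
/// adding `-O` only when optimization is requested.
fn transpile_args(input: &str, output: &str, optimize: bool) -> Vec<String> {
    let mut args = vec![input.to_string()];
    if optimize {
        args.push("-O".to_string());
    }
    args.push("--output".to_string());
    args.push(output.to_string());
    args
}
```

In a real build script the value would come from `std::env::var("HERKOS_OPTIMIZE").ok()`, passed on with `.as_deref()`. Printing `cargo:rerun-if-env-changed=HERKOS_OPTIMIZE` asks Cargo to rebuild when the variable changes, so the two required runs do not reuse stale generated modules.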

CLI usage:
```bash
cargo run -p herkos -- input.wasm --output output.rs
cargo run -p herkos -- input.wasm --output output.rs # transpile
cargo run -p herkos -- input.wasm -O --output output.rs # with optimizations
```

### Sphinx documentation

The `docs/` directory is a Sphinx project using [MyST](https://myst-parser.readthedocs.io/) (Markdown) and [sphinx-needs](https://sphinx-needs.readthedocs.io/) (traceability directives). Build with:

```bash
cd docs
python -m venv .venv && source .venv/bin/activate # first time only
pip install -r requirements.txt # first time only

make html # generate auto-files then build HTML → _build/html/index.html
make clean # remove build artifacts
```

`make html` runs `python scripts/generate_all.py` first (auto-generates need files), then calls `sphinx-build`.

## Key architectural concepts

### Memory model
@@ -113,15 +143,23 @@ See SPECIFICATION.md §4.5.
- `WasmResult<T> = Result<T, WasmTrap>` — no panics, no unwinding
- `ConstructionError` for programming errors during module instantiation

### SSA IR and phi lowering

The IR is pure SSA: every variable is defined exactly once (`DefVar` token, non-`Copy`; enforced at build time). Phi nodes at join points are lowered to predecessor `Assign` instructions before codegen by `lower_phis::lower()`, which returns a `LoweredModuleInfo` newtype that statically guarantees no phi nodes remain. Optimization runs in two phases: **pre-lowering** (on SSA IR with phi nodes) and **post-lowering** (on `LoweredModuleInfo`).
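
The lowering step can be sketched on a toy IR. Everything below (`Instr`, `Block`, `Lowered`, and this `lower_phis` function) is a simplified, hypothetical model and not the real `herkos-core` types. It ignores terminators and the parallel-copy ordering problems that a production SSA destruction pass has to handle.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified instruction set (not the herkos `IrInstr`).
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    /// dest = constant
    Const { dest: u32, value: i64 },
    /// dest = src; this is what a lowered phi turns into
    Assign { dest: u32, src: u32 },
    /// dest = phi[(predecessor block index, source variable), ...]
    Phi { dest: u32, sources: Vec<(usize, u32)> },
}

#[derive(Debug, Clone, PartialEq, Default)]
struct Block {
    instrs: Vec<Instr>,
}

/// Newtype produced only by `lower_phis`. In a real crate the field would be
/// private to its module, so holding a `Lowered` implies phi-free IR.
#[derive(Debug, PartialEq)]
struct Lowered(Vec<Block>);

impl Lowered {
    fn blocks(&self) -> &[Block] {
        &self.0
    }
}

/// Replace each phi with one `Assign` appended to every predecessor block.
/// Assumes every predecessor index is a valid block index.
fn lower_phis(blocks: Vec<Block>) -> Lowered {
    let mut pending: BTreeMap<usize, Vec<Instr>> = BTreeMap::new();
    let mut out: Vec<Block> = Vec::with_capacity(blocks.len());

    for block in blocks {
        let mut kept = Vec::new();
        for instr in block.instrs {
            match instr {
                // Drop the phi and remember one copy per incoming edge.
                Instr::Phi { dest, sources } => {
                    for (pred, src) in sources {
                        pending
                            .entry(pred)
                            .or_default()
                            .push(Instr::Assign { dest, src });
                    }
                }
                other => kept.push(other),
            }
        }
        out.push(Block { instrs: kept });
    }

    // Append the collected copies at the end of each predecessor.
    for (pred, copies) in pending {
        out[pred].instrs.extend(copies);
    }
    Lowered(out)
}
```

After lowering, the phi's destination is assigned once in each predecessor, so the result is no longer SSA. Returning a distinct type lets later passes and codegen require phi-free input in their signatures instead of checking for it at runtime.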

### Env<H> context pattern

Functions that call imports or read/write mutable globals receive an `Env<'_, H>` parameter bundling `host: &mut H` and `globals: &mut Globals`. This avoids threading host + globals as separate parameters throughout every function signature.
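
A minimal sketch of the pattern follows, assuming a hypothetical `Host` trait, a one-field `Globals`, and a made-up function `bump_and_log`. The real generated structs and import traits will differ. `RecordingHost` uses `Vec` for brevity, which would not be available in `no_std` code without `alloc`.

```rust
/// Hypothetical stand-in for the generated mutable-globals struct.
struct Globals {
    counter: i32,
}

/// Hypothetical import trait: one capability the module needs from its host.
trait Host {
    fn log(&mut self, value: i32);
}

/// Bundles the host and the globals so functions take a single context parameter.
struct Env<'a, H> {
    host: &'a mut H,
    globals: &'a mut Globals,
}

/// A function that both mutates a global and calls an import.
/// Without `Env`, it would need separate `host` and `globals` parameters.
fn bump_and_log<H: Host>(env: &mut Env<'_, H>, by: i32) -> i32 {
    env.globals.counter = env.globals.counter.wrapping_add(by);
    env.host.log(env.globals.counter);
    env.globals.counter
}

/// Example host that records every logged value.
struct RecordingHost {
    seen: Vec<i32>,
}

impl Host for RecordingHost {
    fn log(&mut self, value: i32) {
        self.seen.push(value);
    }
}
```

Because `Env` holds both mutable borrows, a callee that needs the context forwards one `&mut Env` rather than reborrowing two separate references at every call site.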

### Current status

- **Implemented**: Safe backend only (runtime bounds checking, no unsafe in output)
- **Not yet implemented**: Verified backend, hybrid backend, `--max-pages` CLI effect, WASI traits
- **Implemented**: Safe backend only (runtime bounds checking, no unsafe in output), full optimizer pipeline, C FFI
- **Not yet implemented**: Verified backend, hybrid backend, `--max-pages` CLI flag, WASI traits
- See `docs/FUTURE.md` for planned features

## `no_std` constraint

`herkos-runtime` and all transpiled output **must be `#![no_std]`**. No heap allocation without the optional `alloc` feature gate. No panics, no `format!`, no `String`. Errors are `Result<T, WasmTrap>` only. The `herkos` CLI crate is a standard `std` binary.
`herkos-runtime` and all transpiled output **must be `#![no_std]`**. No heap allocation without the optional `alloc` feature gate. No panics, no `format!`, no `String`. Errors are `Result<T, WasmTrap>` only. The `herkos` CLI and `herkos-core` crates are standard `std` binaries/libraries.

## Coding conventions

4 changes: 2 additions & 2 deletions README.md
@@ -151,8 +151,8 @@ cargo bench -p herkos-tests # benchmarks

## Documentation

- [Requirements](docs/REQUIREMENTS.md) — formal requirements (REQ_* IDs)
- [Specification](docs/SPECIFICATION.md) — architecture, transpilation rules, memory model, security analysis
- [Requirements](docs/REQUIREMENTS.rst) — formal requirements (REQ_* IDs)
- [Specification](docs/SPECIFICATION.rst) — architecture, transpilation rules, memory model, security analysis
- [Future work](docs/FUTURE.md) — verified backend, hybrid backend, temporal isolation

## License
1 change: 1 addition & 0 deletions docs/.gitignore
@@ -0,0 +1 @@
_gen/
4 changes: 2 additions & 2 deletions docs/FUTURE.md
@@ -1,6 +1,6 @@
# Future Extensions

This document describes features that are **planned but not yet implemented**. For the current specification, see [SPECIFICATION.md](SPECIFICATION.md).
This document describes features that are **planned but not yet implemented**. For the current specification, see [Specification](specification/index.rst).

---

@@ -183,4 +183,4 @@ fn load_i32_verified(memory: &IsolatedMemory<MAX_PAGES>, offset: u32) -> i32 {
- **Automated refactoring suggestions** for better Rust idioms in generated code
- **DWARF debug info preservation** for source-level debugging of transpiled code
- **Proof coverage reports**: per-function and per-module percentage of accesses that are proven vs. runtime-checked
- **Dynamic linking** of transpiled modules (open question — see [SPECIFICATION.md §8](SPECIFICATION.md#8-open-questions))
- **Dynamic linking** of transpiled modules (open question — see {ref}`SPECIFICATION §7 <open-questions>`)
164 changes: 164 additions & 0 deletions docs/GETTING_STARTED.md
@@ -0,0 +1,164 @@
# Getting Started

## Installation

From crates.io:
```bash
cargo install herkos
```

From source:
```bash
git clone https://github.com/arnoox/herkos.git
cd herkos
cargo install --path crates/herkos
```

## Basic Usage

```bash
herkos input.wasm --output output.rs
herkos input.wasm -O --output output.rs # with IR optimizations enabled
```

| Option | Description | Required |
|--------|-------------|----------|
| `input.wasm` | Path to WebAssembly module | Yes |
| `--output`, `-o` | Output Rust file path (defaults to stdout) | No |
| `--optimize`, `-O` | Enable IR optimization passes | No |

## Understanding the Output

The transpiler produces a self-contained Rust source file that depends only on `herkos-runtime`. The output contains:

```rust
// Generated output.rs
use herkos_runtime::*;

struct Globals { ... } // ← mutable globals
const G1: i64 = 42; // ← immutable

fn func_0(...) { ... } // ← Wasm functions
fn func_1(...) { ... }

struct Module<MAX_PAGES, TABLE_SIZE> {
memory: IsolatedMemory<MAX_PAGES>,
globals: Globals,
table: Table<TABLE_SIZE>,
}

impl Module { ... } // ← exports as methods
trait ModuleImports { ... } // ← required capabilities
```

## Using Transpiled Code

### Direct inclusion

```rust
use herkos_runtime::{IsolatedMemory, WasmResult};

include!("path/to/output.rs");

fn main() -> WasmResult<()> {
let mut module = Module::<256, 4>::new(
16, // initial pages
Globals::default(), // module globals
Table::default(), // call table
)?;

let result = module.my_function(42)?;
println!("Result: {}", result);
Ok(())
}
```

### Via build.rs (recommended for automated workflows)

```rust
// build.rs
use std::env;
use std::path::PathBuf;

fn main() {
let out_dir = env::var("OUT_DIR").unwrap();
let out_path = PathBuf::from(&out_dir);

println!("cargo:rerun-if-changed=wasm-modules/math.wasm");

let wasm_bytes = std::fs::read("wasm-modules/math.wasm").unwrap();
let options = herkos::TranspileOptions::default();
let rust_code = herkos::transpile(&wasm_bytes, &options).unwrap();
std::fs::write(out_path.join("math_module.rs"), rust_code).unwrap();
}
```

```rust
// src/main.rs
use herkos_runtime::WasmResult;
include!(concat!(env!("OUT_DIR"), "/math_module.rs"));

fn main() -> WasmResult<()> {
let mut module = Module::<16, 0>::new(1, Globals::default(), Table::default())?;
let result = module.add(5, 3)?;
println!("Result: {}", result);
Ok(())
}
```

When including multiple modules, wrap them in Rust modules to avoid name collisions:

```rust
mod math {
include!(concat!(env!("OUT_DIR"), "/math_module.rs"));
}
mod crypto {
include!(concat!(env!("OUT_DIR"), "/crypto_module.rs"));
}
```

## Example: C to Rust via Wasm

This example walks through the full pipeline: starting from a C source file,
compiling it to a Wasm binary, and then using `herkos` to transpile that binary
into safe Rust code you can call directly.

Start with a simple C library:

```c
// math.c
int add(int a, int b) { return a + b; }
int multiply(int a, int b) { return a * b; }
```

Compile it to Wasm using `clang` and `wasm-ld`, then transpile with `herkos`:

```bash
clang --target=wasm32 -O2 -c math.c -o math.o
wasm-ld math.o -o math.wasm --no-entry
herkos math.wasm --output math.rs
```

`--no-entry` tells the linker not to require a `main` symbol, since this is a
library. The transpiler reads `math.wasm` and writes `math.rs`, a self-contained
Rust module that only depends on `herkos-runtime`.

Include the generated file and instantiate the module to call its exports:

```rust
use herkos_runtime::WasmResult;
include!("math.rs");

fn main() -> WasmResult<()> {
let mut module = Module::<16, 0>::new(1, Globals::default(), Table::default())?;
println!("2 + 3 = {}", module.add(2, 3)?);
println!("4 * 5 = {}", module.multiply(4, 5)?);
Ok(())
}
```

The const generics `<16, 0>` set the memory limit to 16 pages (1 MiB) and the
indirect-call table size to 0, since this module makes no indirect calls. Each
exported function returns `WasmResult<T>`, propagating any Wasm traps (such as
out-of-bounds memory access or integer overflow) as `Err(WasmTrap)` rather than
panicking.
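
To make the page arithmetic and the trap-as-`Err` behavior concrete, here is a small standalone sketch. It does not use `herkos-runtime`: the `WasmTrap` variants and the helpers `memory_bytes`, `div_s`, and `load_u8` are hypothetical stand-ins, and only the alias `WasmResult<T> = Result<T, WasmTrap>` mirrors the runtime's documented shape.

```rust
/// A Wasm page is 64 KiB.
const PAGE_SIZE: usize = 65_536;

/// Hypothetical trap enum; the real `herkos_runtime::WasmTrap` variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WasmTrap {
    OutOfBounds,
    DivisionByZero,
    IntegerOverflow,
}

type WasmResult<T> = Result<T, WasmTrap>;

/// Bytes addressable with the given page limit.
const fn memory_bytes(max_pages: usize) -> usize {
    max_pages * PAGE_SIZE
}

/// Signed division with Wasm semantics: a zero divisor and
/// `i32::MIN / -1` are traps, returned here as `Err` values.
fn div_s(a: i32, b: i32) -> WasmResult<i32> {
    if b == 0 {
        return Err(WasmTrap::DivisionByZero);
    }
    a.checked_div(b).ok_or(WasmTrap::IntegerOverflow)
}

/// Bounds-checked byte load: an out-of-range offset is an error, not a panic.
fn load_u8(memory: &[u8], offset: usize) -> WasmResult<u8> {
    memory.get(offset).copied().ok_or(WasmTrap::OutOfBounds)
}
```

With these definitions, `memory_bytes(16)` evaluates to 1,048,576 bytes (1 MiB), and `div_s(i32::MIN, -1)` returns `Err(WasmTrap::IntegerOverflow)` instead of panicking. A caller can propagate either error with `?`, as the examples above do.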
19 changes: 19 additions & 0 deletions docs/Makefile
@@ -0,0 +1,19 @@
SPHINXBUILD = sphinx-build
SOURCEDIR = .
BUILDDIR = _build

.PHONY: help html generate clean

help:
@echo " html build HTML documentation"
@echo " generate regenerate auto-generated need files"
@echo " clean remove build artifacts"

generate:
python3 scripts/generate_all.py

html: generate
$(SPHINXBUILD) -b html $(SOURCEDIR) $(BUILDDIR)/html

clean:
rm -rf $(BUILDDIR)