BENCH-001: Ternary vs f16/bf16 Comparison (Minimal Research) #491

@gHashTag

Description

Task

Minimal BENCH-001: Compare ternary, fp16, and bf16 weights against an fp32 baseline on a small MLP, using the existing .tri specs.

Goal: First honest "ternary vs binary" validation for TRI-27 algorithms.

Approach

  • Use existing specs/algo/mlp.tri (4 → 8 → 3 architecture)
  • Compare: ternary {-1,0,+1} vs fp32 vs fp16 vs bf16
  • Metrics: accuracy, MSE loss, memory usage
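To make two of the metrics above concrete, here is a minimal sketch of MSE loss and weight memory for the 4 → 8 → 3 architecture. It assumes Zig 0.11 or later. The names `mse`, `weight_count`, and `packedBytes` are hypothetical and not taken from the repository; biases are excluded from the count, and the 2-bit packing of ternary weights is only one possible storage choice, not something the issue specifies.

```zig
const std = @import("std");

/// Mean squared error between a prediction and a target of equal length.
fn mse(pred: []const f32, target: []const f32) f32 {
    std.debug.assert(pred.len == target.len);
    var sum: f32 = 0;
    for (pred, target) |p, t| {
        const d = p - t;
        sum += d * d;
    }
    return sum / @as(f32, @floatFromInt(pred.len));
}

/// Weight count of the 4 -> 8 -> 3 MLP, biases excluded (an assumption).
const weight_count: usize = 4 * 8 + 8 * 3;

/// Bytes needed to store `n` weights at `bits_per_weight` bits, rounded up.
fn packedBytes(n: usize, bits_per_weight: usize) usize {
    return (n * bits_per_weight + 7) / 8;
}

pub fn main() void {
    // Hypothetical prediction and one-hot target, for illustration only.
    const pred = [_]f32{ 0.7, 0.2, 0.1 };
    const target = [_]f32{ 1.0, 0.0, 0.0 };
    std.debug.print("mse={d}\n", .{mse(&pred, &target)});

    // Total weight storage per format, in bytes.
    std.debug.print("fp32={d} fp16/bf16={d} ternary(i8)={d} ternary(2-bit)={d}\n", .{
        packedBytes(weight_count, 32),
        packedBytes(weight_count, 16),
        packedBytes(weight_count, 8),
        packedBytes(weight_count, 2),
    });
}
```

With 56 weights this works out to 224 bytes for fp32, 112 for fp16 or bf16, 56 for ternary stored one value per byte, and 14 for ternary packed at 2 bits per weight. Whichever storage is chosen for ternary should be stated alongside the reported memory figure.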

Experiments

| Experiment | Input | Weights | Metrics |
|---|---|---|---|
| Baseline | [1,0,0,0] | fp32 | loss, accuracy |
| Ternary | [1,0,0,0] | {-1,0,+1} | loss, accuracy |
| FP16 | [1,0,0,0] | fp16 | loss, accuracy |
| BF16 | [1,0,0,0] | bf16 | loss, accuracy |

Implementation

File: src/bench_ternary_vs_binary.zig

// 1. fp32 baseline (already exists in test_mlp_semantic.zig)
// 2. Quantize weights to ternary {-1,0,+1}
// 3. Quantize weights to fp16/bf16
// 4. Compare outputs
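Steps 2 and 3 could be sketched roughly as follows, assuming Zig 0.11 or later (for the result-type cast builtins). The function names, the threshold parameter, and the sample weights are all hypothetical; the issue does not say how the ternary threshold is chosen, and a fixed cutoff is only one option. The bf16 path keeps the top 16 bits of the f32 bit pattern with round-to-nearest-even and skips NaN handling for brevity.

```zig
const std = @import("std");

/// Threshold-based ternary quantization: maps a weight to -1, 0, or +1.
/// `threshold` is a hypothetical tuning parameter, not taken from the .tri spec.
fn quantizeTernary(w: f32, threshold: f32) i8 {
    if (w > threshold) return 1;
    if (w < -threshold) return -1;
    return 0;
}

/// Round-trip through IEEE half precision using Zig's native f16 type.
fn roundTripF16(w: f32) f32 {
    const h: f16 = @floatCast(w);
    return @floatCast(h);
}

/// Round-trip through bfloat16: keep the upper 16 bits of the f32 pattern,
/// rounding to nearest even. NaN inputs are not handled specially.
fn roundTripBf16(w: f32) f32 {
    const bits: u32 = @bitCast(w);
    const lsb: u32 = (bits >> 16) & 1;
    const rounded: u32 = (bits +% (0x7FFF + lsb)) & 0xFFFF0000;
    return @bitCast(rounded);
}

pub fn main() void {
    // Hypothetical weights, chosen only to exercise each code path.
    const weights = [_]f32{ 0.8, -0.03, -0.6, 0.1234567 };
    for (weights) |w| {
        std.debug.print("fp32={d} ternary={d} fp16={d} bf16={d}\n", .{
            w,
            quantizeTernary(w, 0.05),
            roundTripF16(w),
            roundTripBf16(w),
        });
    }
}
```

Round-tripping back to f32 lets the existing fp32 forward pass be reused unchanged for step 4, so any difference in outputs comes from the weight precision alone.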

Success Criteria

  • Ternary weights achieve <5% accuracy loss vs fp32
  • BF16 within 1% of fp32 (expected from BENCH-004)
  • Research log with raw numbers (not marketing)
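The first two criteria can be checked mechanically once accuracies are available. The sketch below assumes Zig 0.11 or later and reads both thresholds as percentage points of accuracy, which is an interpretation: the issue does not say whether "5%" and "1%" are absolute or relative. `absDiff` and `meetsCriteria` are hypothetical names.

```zig
const std = @import("std");

/// Absolute difference between two accuracies, in percentage points.
fn absDiff(a: f32, b: f32) f32 {
    const d = a - b;
    return if (d < 0) -d else d;
}

/// Returns true when ternary loses less than 5 points against fp32
/// and bf16 stays within 1 point of fp32 (percentage-point reading, assumed).
fn meetsCriteria(fp32_acc: f32, ternary_acc: f32, bf16_acc: f32) bool {
    const ternary_ok = (fp32_acc - ternary_acc) < 5.0;
    const bf16_ok = absDiff(fp32_acc, bf16_acc) <= 1.0;
    return ternary_ok and bf16_ok;
}

pub fn main() void {
    // Accuracies copied from the sample CSV in this issue.
    const ok = meetsCriteria(95.0, 92.0, 94.9);
    std.debug.print("criteria met: {}\n", .{ok});
}
```

If the relative reading is intended instead, the differences would be divided by the fp32 accuracy before comparing against the thresholds.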

Dependencies

Timeline

Target: April 5, 2026 (3 days)

Output

Expected format of results/bench_001_summary.csv (the values shown are placeholders, not measured results):

format,accuracy,loss,memory_bytes
fp32,95.0,0.123,4
ternary,92.0,0.145,1
fp16,94.8,0.124,2
bf16,94.9,0.123,2
