BENCH-001: Ternary vs f16/bf16 Comparison (Minimal Research) #491

## Task

Minimal BENCH-001: compare ternary, fp16, and bf16 weights against an fp32 baseline on a small MLP, using the existing `.tri` specs.

**Goal**: First honest "ternary vs binary" validation for TRI-27 algorithms.

**Approach**:
- Use existing `specs/algo/mlp.tri` (4 → 8 → 3 architecture)
- Compare: ternary {-1,0,+1} vs fp32 vs fp16 vs bf16
- Metrics: accuracy, MSE loss, memory usage
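For the memory-usage metric, a rough footprint can be derived from the 4 → 8 → 3 layer sizes alone. The sketch below is an assumption-laden estimate, not the project's accounting: it counts weights only (whether `mlp.tri` has biases is not stated here), and the ternary storage widths (one `i8` per weight, or 2 bits per weight when packed) are hypothetical choices. It assumes Zig 0.11 or later.

```zig
const std = @import("std");

// Layer sizes taken from the issue: 4 -> 8 -> 3.
const layers = [_]usize{ 4, 8, 3 };

/// Count parameters of a dense MLP. Biases are optional because the
/// issue does not say whether mlp.tri uses them (assumption).
fn paramCount(sizes: []const usize, include_bias: bool) usize {
    var total: usize = 0;
    var i: usize = 0;
    while (i + 1 < sizes.len) : (i += 1) {
        total += sizes[i] * sizes[i + 1];
        if (include_bias) total += sizes[i + 1];
    }
    return total;
}

/// Bytes needed to store n parameters at a given bit width, rounded up.
fn bytesFor(n: usize, bits_per_param: usize) usize {
    return (n * bits_per_param + 7) / 8;
}

pub fn main() void {
    const n = paramCount(&layers, false);
    std.debug.print("weights:          {d}\n", .{n});
    std.debug.print("fp32:             {d} bytes\n", .{bytesFor(n, 32)});
    std.debug.print("fp16 / bf16:      {d} bytes\n", .{bytesFor(n, 16)});
    std.debug.print("ternary as i8:    {d} bytes\n", .{bytesFor(n, 8)});
    std.debug.print("ternary, 2-bit:   {d} bytes\n", .{bytesFor(n, 2)});
}
```

With weights only, this gives 56 parameters: 224 bytes in fp32, 112 in fp16 or bf16, 56 as `i8` ternary, and 14 when packed at 2 bits per weight.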

## Experiments

| Experiment | Input | Weights | Metrics |
|-----------|-------|---------|---------|
| Baseline | [1,0,0,0] | fp32 | loss, accuracy |
| Ternary | [1,0,0,0] | {-1,0,+1} | loss, accuracy |
| FP16 | [1,0,0,0] | fp16 | loss, accuracy |
| BF16 | [1,0,0,0] | bf16 | loss, accuracy |
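The two metrics in the table could be computed along the following lines. This is a minimal sketch, assuming the 3-element network output is compared against a one-hot target and that the predicted class is the index of the largest output; the helper names `mse` and `argmax` are hypothetical and not taken from the repository. It assumes Zig 0.11 or later.

```zig
const std = @import("std");

/// Mean squared error between a prediction and a target of equal length.
fn mse(pred: []const f32, target: []const f32) f32 {
    std.debug.assert(pred.len == target.len);
    var sum: f32 = 0;
    var i: usize = 0;
    while (i < pred.len) : (i += 1) {
        const d = pred[i] - target[i];
        sum += d * d;
    }
    return sum / @as(f32, @floatFromInt(pred.len));
}

/// Index of the largest element, used here as the predicted class.
fn argmax(v: []const f32) usize {
    var best: usize = 0;
    var i: usize = 1;
    while (i < v.len) : (i += 1) {
        if (v[i] > v[best]) best = i;
    }
    return best;
}

test "mse and argmax on a 3-class output" {
    // Made-up values for illustration only.
    const pred = [_]f32{ 0.5, 0.0, 0.0 };
    const target = [_]f32{ 1.0, 0.0, 0.0 };
    try std.testing.expectApproxEqAbs(@as(f32, 0.25 / 3.0), mse(&pred, &target), 1e-6);
    try std.testing.expectEqual(@as(usize, 0), argmax(&pred));
}
```

Accuracy is then the fraction of inputs whose `argmax` matches the target class. With the single input listed in the table, that fraction can only be 0 or 1.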

## Implementation

**File**: `src/bench_ternary_vs_binary.zig`

```zig
// 1. fp32 baseline (already exists in test_mlp_semantic.zig)
// 2. Quantize weights to ternary {-1,0,+1}
// 3. Quantize weights to fp16/bf16
// 4. Compare outputs
```
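Steps 2 to 4 might look roughly like the sketch below. Several things here are assumptions rather than the project's method: the ternary rule (a fixed symmetric threshold), the bf16 conversion (keeping the top 16 bits of the f32 pattern with round-to-nearest-even), and the example weights, none of which are read from `mlp.tri`. It does not use `src/formats.zig`, and it assumes Zig 0.11 or later.

```zig
const std = @import("std");

/// Map a weight to {-1, 0, +1} with a symmetric dead zone (assumed rule).
fn toTernary(w: f32, threshold: f32) i8 {
    if (w > threshold) return 1;
    if (w < -threshold) return -1;
    return 0;
}

/// Round-trip through IEEE half precision using Zig's builtin f16.
fn roundTripF16(w: f32) f32 {
    const h: f16 = @floatCast(w);
    return @as(f32, h);
}

/// Round-trip through bfloat16: keep the top 16 bits of the f32 pattern,
/// rounding to nearest even. NaN inputs are not handled in this sketch.
fn roundTripBf16(w: f32) f32 {
    const bits: u32 = @bitCast(w);
    const lsb: u32 = (bits >> 16) & 1;
    const rounded: u32 = bits +% (0x7FFF + lsb);
    const kept: u32 = rounded & 0xFFFF0000;
    return @bitCast(kept);
}

pub fn main() void {
    // Hypothetical weights for one hidden unit; not read from mlp.tri.
    const w = [_]f32{ 0.8, -0.05, -0.6, 0.3 };
    const x = [_]f32{ 1, 0, 0, 0 }; // input from the experiment table
    const threshold: f32 = 0.5; // assumed dead-zone width

    var y_fp32: f32 = 0;
    var y_f16: f32 = 0;
    var y_bf16: f32 = 0;
    var y_tern: f32 = 0;
    var i: usize = 0;
    while (i < w.len) : (i += 1) {
        const t: f32 = @floatFromInt(toTernary(w[i], threshold));
        y_fp32 += w[i] * x[i];
        y_f16 += roundTripF16(w[i]) * x[i];
        y_bf16 += roundTripBf16(w[i]) * x[i];
        y_tern += t * x[i];
    }
    std.debug.print("fp32={d} f16={d} bf16={d} ternary={d}\n", .{ y_fp32, y_f16, y_bf16, y_tern });
}
```

Only the weights are quantized here; activations and accumulation stay in f32, so the comparison isolates the effect of weight precision. A per-layer scale factor for the ternary weights is a common refinement that this sketch leaves out.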

## Success Criteria

- [ ] Ternary weights achieve <5% accuracy loss vs fp32
- [ ] BF16 within 1% of fp32 (expected from BENCH-004)
- [ ] Research log with raw numbers (not marketing)
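The first two criteria can be turned into a mechanical check. The sketch below reads "accuracy loss" as a drop in percentage points relative to fp32, which is an assumption: the issue does not say whether the thresholds are absolute or relative. The helper names are hypothetical, and the figures in the test are the ones listed in the Output section of this issue. It assumes Zig 0.11 or later.

```zig
const std = @import("std");

/// Accuracy drop in percentage points relative to the fp32 baseline.
fn accuracyDrop(baseline_pct: f64, candidate_pct: f64) f64 {
    return baseline_pct - candidate_pct;
}

/// True when the candidate loses less than max_drop_pct points.
fn passes(baseline_pct: f64, candidate_pct: f64, max_drop_pct: f64) bool {
    return accuracyDrop(baseline_pct, candidate_pct) < max_drop_pct;
}

test "thresholds from the success criteria" {
    try std.testing.expect(passes(95.0, 92.0, 5.0)); // ternary: 3.0 point drop
    try std.testing.expect(passes(95.0, 94.9, 1.0)); // bf16: about 0.1 point drop
    try std.testing.expect(!passes(95.0, 89.0, 5.0)); // a 6.0 point drop fails
}
```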

## Dependencies

- Uses: `specs/algo/mlp.tri`, `src/formats.zig` (GF16 encode/decode)
- Related: #479 (BENCH-001 parent issue)

## Timeline

**Target**: April 5, 2026 (3 days)

## Output

`results/bench_001_summary.csv` (expected format):
```csv
format,accuracy,loss,memory_bytes
fp32,95.0,0.123,4
ternary,92.0,0.145,1
fp16,94.8,0.124,2
bf16,94.9,0.123,2
```
