Four benchmarks covering different aspects of NIMA's cognitive architecture.
What it tests: Pure retrieval — "is the correct memory in top K?"
Metrics: Recall@1, Recall@5, Recall@10, Recall@20, MRR
Independent of: LLM synthesis, emotional context
Why it matters: Foundation layer — if retrieval fails, nothing else works
Run:
cd ~/.openclaw/workspace/PROJECTS/nima-bench
python3 -c "from synthetic.benchmark import TESTS, run_synthetic_benchmark; run_synthetic_benchmark()"What it tests: How fast does NIMA retrieve at scale?
Metrics: Latency (ms) at 1K, 10K, 100K memories
Tests: Vector-only, Vector+Graph, Full RRF pipeline
Why it matters: Personal AI needs to be fast. 100ms or less per query.
Run:
python3 speed_benchmark/benchmark.py --scales 1000 10000 100000What it tests: Does NIMA understand the EMOTIONAL context of memories?
Metrics: Human-judged 1-5 on: emotional context, factual accuracy, synthesis depth
What makes NIMA different: No other memory system measures this
Why it matters: Raw retrieval wins on synthetic benchmarks. Emotional synthesis is NIMA's edge.
Run (interactive):
python3 emotional_synthesis/benchmark.pyWhat it tests: Does NIMA maintain a consistent, evolving model of David?
Metrics: Human-judged on: consistency, accuracy, drift detection
What it probes: Relationships, belief evolution, stale understanding
Why it matters: A good memory system remembers who you ARE, not just what you said
Run (interactive):
python3 personal_continuity/benchmark.py# List all benchmarks
python3 runner.py --list
# Run everything
python3 runner.py --all
# Run specific benchmarks
python3 runner.py --speed --synthetic
python3 runner.py --emotional --continuity
# Save results
python3 runner.py --all --output ./results/2026-04-10MemPalace proved: raw verbatim + embeddings beats LLM-extracted structured memories on synthetic benchmarks.
NIMA's cognitive layers (affect, synthesis) aren't tested by R@K benchmarks. They're tested by the emotional_synthesis and personal_continuity benchmarks — which require human judgment.
This is NIMA's frontier. Push there.