Publish Performance Benchmarks

We need to publish a table of performance benchmark numbers showing Pythia's throughput (how many traces it can process) under the following conditions:

1. Single validator (hallucination detection alone)
2. Multiple validators
3. External LLM (different models: GPT-4, GPT-4-mini, etc.)
4. Different hardware configurations (RAM, CPU)
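
As a starting point for collecting these numbers, the measurement could be sketched as below. This is a minimal sketch under stated assumptions, not Pythia's actual API: `measure_throughput`, `format_row`, and the stand-in validator are hypothetical names, and a validator is assumed to be any callable that takes one trace (represented here as a dict). The real validators, external LLM calls, and hardware setup would need to be plugged in for each condition.

```python
import time
from typing import Callable, Iterable, Sequence


def measure_throughput(
    traces: Sequence[dict],
    validators: Iterable[Callable[[dict], object]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return traces processed per second when every validator runs on every trace.

    Hypothetical helper: `validators` stands in for whatever Pythia actually runs.
    """
    validators = list(validators)
    start = clock()
    for trace in traces:
        for validate in validators:
            validate(trace)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive; use more traces")
    return len(traces) / elapsed


def format_row(condition: str, traces_per_second: float) -> str:
    """Format one row of the benchmark table as fixed-width plain text."""
    return f"{condition:<40} {traces_per_second:>12.1f}"


if __name__ == "__main__":
    # Stand-in validator (assumption): replace with the real hallucination detector.
    def dummy_validator(trace: dict) -> bool:
        return len(trace.get("output", "")) > 0

    sample_traces = [{"output": f"trace {i}"} for i in range(1000)]
    rate = measure_throughput(sample_traces, [dummy_validator])
    print(format_row("Single validator (stand-in)", rate))
```

Running the same helper once per condition (one validator, several validators, each external model, each hardware configuration) would yield one table row per condition. The injectable `clock` parameter is a design choice that makes the helper testable without real timing.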