Hi,
I implemented a structured benchmarking module for evaluating eye-tracking accuracy and precision, aligned with the project’s validation goals.
Key additions:
- Accuracy metrics (mean, median, p95) in pixels and degrees
- Per-target RMS precision metric, following established eye-tracking validation practice
- Data quality reporting
- Per-target accuracy breakdown
- New API endpoint: POST /api/session/benchmark
- Input validation and NumPy serialization handling
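To make the metric definitions concrete for review, here is a minimal sketch of how the accuracy and per-target RMS precision metrics could be computed. The function names (`accuracy_metrics`, `rms_s2s_precision`) are illustrative, not the actual module API, and the sketch assumes precision is defined as RMS of sample-to-sample displacement (RMS-S2S), a common choice in eye-tracking literature; the real module may define it differently.

```python
import numpy as np

def accuracy_metrics(gaze_px: np.ndarray, target_px: np.ndarray) -> dict:
    """Accuracy: Euclidean offset of each gaze sample from the target.

    gaze_px: (N, 2) array of gaze points in pixels.
    target_px: (2,) target position in pixels.
    """
    errors = np.linalg.norm(gaze_px - target_px, axis=1)
    return {
        "mean_px": float(np.mean(errors)),
        "median_px": float(np.median(errors)),
        "p95_px": float(np.percentile(errors, 95)),
    }

def rms_s2s_precision(gaze_px: np.ndarray) -> float:
    """Precision: RMS of sample-to-sample displacement (RMS-S2S)."""
    diffs = np.diff(gaze_px, axis=0)            # (N-1, 2) successive deltas
    return float(np.sqrt(np.mean(np.sum(diffs**2, axis=1))))
```

Pixel values would be converted to degrees of visual angle using the screen geometry and viewing distance before the degree-based variants are reported.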
This enables reproducible and standardized benchmarking across devices and setups.
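For the NumPy serialization handling mentioned above, one standard approach is a custom `json.JSONEncoder` that downcasts NumPy scalars and arrays to plain Python types before the benchmark report is returned from the endpoint. This is a sketch of that pattern, not necessarily how the module implements it:

```python
import json
import numpy as np

class NumpyJSONEncoder(json.JSONEncoder):
    """Convert NumPy scalars and arrays to JSON-serializable Python types."""
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)

# Usage when building the endpoint response:
# json.dumps(benchmark_report, cls=NumpyJSONEncoder)
```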
Before opening a PR, I’d like feedback on:
- Metric definitions
- API structure
- Naming conventions
- Integration approach
Looking forward to your thoughts.