feat(runner): add optional timing collection for grading workflows#160

Open
Alexxigang wants to merge 1 commit into agentscope-ai:main from Alexxigang:feat/grading-runner-timing

Conversation

@Alexxigang

Summary

  • add a reusable TimingCollector utility under openjudge/utils/timer.py with context-manager and decorator support
  • instrument GradingRunner to optionally collect latency records for single evaluations, whole datasets, and multi-dataset runs
  • expose timing data through get_timing_records() and get_timing_summary(), with clear_timing_records() to reset collected records
  • add unit tests for the timing utility and runner timing integration
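The timer module itself is not shown in this view of the PR. As a rough sketch of what a timing utility with context-manager and decorator support might look like (the class shape, method names, and record format below are illustrative assumptions, not the actual `openjudge/utils/timer.py` API):

```python
import logging
import time
from contextlib import contextmanager
from functools import wraps

logger = logging.getLogger(__name__)


class TimingCollector:
    """Collect named latency records in memory; log each at DEBUG level."""

    def __init__(self):
        # Each record is a (name, elapsed_seconds) tuple.
        self.records = []

    @contextmanager
    def measure(self, name):
        """Context manager: time the enclosed block and store a record."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.records.append((name, elapsed))
            logger.debug("%s took %.6fs", name, elapsed)

    def timed(self, name):
        """Decorator form: time every call to the wrapped function."""
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                with self.measure(name):
                    return func(*args, **kwargs)
            return wrapper
        return decorator
```

A caller could then wrap a single evaluation with `with collector.measure("grade_sample"): ...`, or decorate a helper with `@collector.timed("grade_sample")`, and inspect `collector.records` afterwards.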

Why this fix

OpenJudge did not have a built-in way to measure the latency of core evaluation workflow steps, which made it harder to identify bottlenecks or track regressions over time.

This PR introduces a lightweight timing utility that logs at DEBUG level by default and stores in-memory timing records for programmatic access. As a pilot integration, the grading workflow now supports optional timing collection without changing existing behavior when timing is disabled.
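The accessor names come from the summary above, but the runner internals are not shown here. A minimal sketch of how opt-in timing with these accessors could work (the constructor flag, record layout, and summary fields are assumptions for illustration; the real GradingRunner has a richer API):

```python
import statistics


class GradingRunner:
    """Sketch: grading runner with optional, off-by-default timing."""

    def __init__(self, enable_timing=False):
        self._timing_enabled = enable_timing
        self._timing_records = []

    def _record_timing(self, step, elapsed):
        """Store a timing record only when collection is enabled."""
        if self._timing_enabled:
            self._timing_records.append({"step": step, "elapsed": elapsed})

    def get_timing_records(self):
        """Return a copy of the raw per-step latency records."""
        return list(self._timing_records)

    def get_timing_summary(self):
        """Aggregate records per step: count, total, and mean latency."""
        by_step = {}
        for rec in self._timing_records:
            by_step.setdefault(rec["step"], []).append(rec["elapsed"])
        return {
            step: {
                "count": len(vals),
                "total": sum(vals),
                "mean": statistics.mean(vals),
            }
            for step, vals in by_step.items()
        }

    def clear_timing_records(self):
        """Drop all collected records."""
        self._timing_records.clear()
```

Because `_record_timing` is a no-op when the flag is off, existing callers that never enable timing see no behavioral change, which matches the "pilot integration" framing above.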

Closes #81.

Validation

  • PYTHONPATH=. pytest tests/utils/test_timer.py tests/runner/test_grading_runner.py -q



Development

Successfully merging this pull request may close these issues.

[Feature]: Add Time Consumption Evaluation for Key Operations
