Implement analyst evaluation mode #3025

Kevinjil · 2025-07-06T10:48:50Z

The idea behind this mode is that we are not really shadowing: we trust the results from the event feed. However, judgehost assignment is limited, so we do not have the capacity to judge all testcases locally. We therefore judge only the "interesting" runs locally to get useful information.

We do not consider 'TLE' or 'AC' interesting, as rerunning will not yield much more information. We consider 'WA' very interesting and prioritize the judging, but allow manual judging to overtake the priority. We consider 'CE' somewhat interesting, but downprioritize them a lot. For other verdicts, keep the normal priority.
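The verdict rules above can be sketched as a small lookup. This is a minimal Python illustration, not the PHP code from this PR: the function name `analyst_priority` is hypothetical, the constant values are inferred from the `-9`/`9` remark in a later commit message, and it assumes that a lower number means the task is picked up earlier.

```python
from typing import Optional

# Assumed values, inferred from "-9 == PRIORITY_HIGH + 1" and "9 == PRIORITY_LOW - 1".
# Assumption: a lower number is handled earlier.
PRIORITY_HIGH = -10     # used here as the priority of manual judging
PRIORITY_DEFAULT = 0
PRIORITY_LOW = 10


def analyst_priority(verdict: str) -> Optional[int]:
    """Return the local judging priority for an upstream verdict.

    None means the run is not interesting and is not judged locally.
    """
    if verdict in ("AC", "TLE"):
        # Rerunning will not yield much more information.
        return None
    if verdict == "WA":
        # Very interesting, but one step behind manual judging.
        return PRIORITY_HIGH + 1
    if verdict == "CE":
        # Somewhat interesting, pushed far back in the queue.
        return PRIORITY_LOW - 1
    # Any other verdict keeps the normal priority.
    return PRIORITY_DEFAULT
```

Because 'WA' maps to `PRIORITY_HIGH + 1`, a manually requested judging at `PRIORITY_HIGH` still sorts ahead of it under the stated assumption.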

webapp/src/Service/ExternalContestSourceService.php

webapp/src/Controller/API/JudgehostController.php

When the QueueTask has already been processed before the upstream verdict arrives, the updated JudgeTask will never be picked up by a judge daemon. To ensure that the interesting run is actually picked up, create a new QueueTask. Note that if an unprocessed queue task already exists, this should not break anything.
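A toy model of why the extra queue entry is needed, as a Python sketch rather than the PR's PHP. The `Scheduler` class and its methods are hypothetical and only mimic the idea that judge daemons look at jobs through queue entries; the real QueueTask and JudgeTask entities are not reproduced here.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Scheduler:
    """Toy model: work is only found through queue entries."""
    pending: Dict[int, List[str]] = field(default_factory=dict)      # job id -> unjudged testcases
    queue: List[Tuple[int, int, int]] = field(default_factory=list)  # (priority, seq, job id)
    seq: int = 0

    def enqueue(self, job_id: int, priority: int) -> None:
        # seq breaks ties so equal priorities are served in insertion order.
        heapq.heappush(self.queue, (priority, self.seq, job_id))
        self.seq += 1

    def fetch(self) -> Optional[Tuple[int, str]]:
        """Hand out the next testcase; drop queue entries with no work left."""
        while self.queue:
            _, _, job_id = self.queue[0]
            work = self.pending.get(job_id, [])
            if work:
                return job_id, work.pop(0)
            heapq.heappop(self.queue)  # processed or duplicate entry: harmless
        return None

    def on_upstream_verdict(self, job_id: int, testcases: List[str], priority: int) -> None:
        """Re-open work for an interesting run and always add a fresh queue entry."""
        self.pending.setdefault(job_id, []).extend(testcases)
        self.enqueue(job_id, priority)
```

In this model, re-opening work for a job whose queue entry was already consumed is invisible to `fetch()` until a new entry is added, and a second entry for the same job is simply discarded once the work runs out.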

webapp/src/Service/ExternalContestSourceService.php

Express the priorities relative to the existing constants; otherwise there is a hidden dependency between the 'actual' constants and these values. So instead of `-9` use `JudgeTask::PRIORITY_HIGH + 1`, and instead of `9` use `JudgeTask::PRIORITY_LOW - 1`.

Suggested-by: Mart Pluijmaekers <[email protected]>

Kevinjil force-pushed the analyst-eval branch 2 times, most recently from 299c4a2 to 60f966d, July 6, 2025 11:47

Kevinjil marked this pull request as ready for review July 6, 2025 15:03

vmcj reviewed Jul 6, 2025

webapp/src/Service/ExternalContestSourceService.php (resolved)

Kevinjil force-pushed the analyst-eval branch from 60f966d to 2f6c3ec, September 3, 2025 06:01

tuupke requested changes Sep 3, 2025

webapp/src/Controller/API/JudgehostController.php (outdated, resolved)

Kevinjil force-pushed the analyst-eval branch 3 times, most recently from b1429a8 to ca69da5, September 3, 2025 07:15

Kevinjil added 2 commits September 3, 2025 11:16

Allow judging the remaining testcases as analyst

e609d99

Kevinjil force-pushed the analyst-eval branch from ca69da5 to e609d99, September 3, 2025 07:16

tuupke reviewed Sep 3, 2025

webapp/src/Service/ExternalContestSourceService.php (outdated, resolved)

Kevinjil added 2 commits September 3, 2025 10:25

Judging the remaining tasks is most important

fc7a681

Kevinjil requested a review from tuupke September 3, 2025 11:22

tuupke approved these changes Sep 3, 2025

Kevinjil added this pull request to the merge queue Sep 3, 2025

Merged via the queue into main with commit 7577142 Sep 3, 2025
42 of 47 checks passed

Kevinjil deleted the analyst-eval branch September 3, 2025 11:54
