Output visualizer #2744
base: main
Conversation
array_filter would only filter out the other judgehosts but still return an array of judgehost objects. By selecting the first (and only) item we can now get the `enabled` property and check it properly.

```
array(1) {
  [1]=>
  array(5) {
    ["id"]=>
    string(1) "2"
    ["hostname"]=>
    string(8) "judgehost"
    ["enabled"]=>
    bool(true)
    ["polltime"]=>
    string(20) "1728821560.017400000"
    ["hidden"]=>
    bool(false)
  }
}
```
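For illustration, a minimal standalone PHP sketch of that behavior (not the actual DOMjudge code): `array_filter` preserves the original array keys, so without reindexing the single remaining judgehost can sit at key 1 and `[0]` would not exist.

```php
<?php
// Standalone sketch, not DOMjudge code: array_filter() keeps the original
// keys, so the only remaining element may live at key 1 instead of key 0.
$judgehosts = [
    ['hostname' => 'other-host', 'enabled' => true],
    ['hostname' => 'judgehost',  'enabled' => true],
];
$myhost = 'judgehost';

$filtered = array_filter($judgehosts, fn($j) => $j['hostname'] === $myhost);
var_dump(isset($filtered[0])); // bool(false): the match kept its key, 1

// Reindex first, then take the first (and only) element.
$judgehost = array_values($filtered)[0];
var_dump($judgehost['enabled']); // bool(true)
```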
Such a script will most likely never be a generic script.
This is still untested, as the syntax should be discussed further.
This assumes an invocation similar to the output_validator's; the alternative is to add this to the domjudge-problem.ini file if it doesn't end up in the ICPC problem spec.
Getting this directly via submission.problem.output_visualizer_executable (the property) seems to fail. It does show up in the Twig dump, but the translation fails when using it.
This is needed during testing of the job on the judgehost.
Debugging via the API is hard, so we hardcode the search here and just copy it over in a later commit.
```diff
@@ -768,7 +771,7 @@ function fetch_executable_internal(
     $judgehosts = request('judgehosts', 'GET');
     if ($judgehosts !== null) {
         $judgehosts = dj_json_decode($judgehosts);
-        $judgehost = array_filter($judgehosts, fn($j) => $j['hostname'] === $myhost);
+        $judgehost = array_values(array_filter($judgehosts, fn($j) => $j['hostname'] === $myhost))[0];
```
This is part of your other PR
```php
    $run_config['hash']
);
if (isset($error)) {
    // FIXME
```
Fix me 😛
Loosely based on what we do for the debug tasks, I'll see if I can merge the shared parts and possibly fix this one.
```php
    $debug_package = base64_decode($request->request->get('visual_output'));
    file_put_contents($tempFilename, $debug_package);
}
// FIXME: error checking
```
Another fixme
```diff
@@ -1478,6 +1533,7 @@ public function getJudgeTasksAction(Request $request): array
         throw new BadRequestHttpException('Argument \'hostname\' is mandatory');
     }
     $hostname = $request->request->get('hostname');
+    $hostname = 'Computer';
```
Whoops
```php
    throw new BadRequestHttpException(
        'Inconsistent data, no judging known with judgingid = ' . $judgeTask->getJobId() . '.');
}
if ($tempFilename = tempnam($this->dj->getDomjudgeTmpDir(), "visual-")) {
```
This means it doesn't work with 2 servers in HA mode. We should store this in the DB as a file.
This is loosely based on what we do with debug packages, so we have the same problem there.
Ah, I didn't know. Then maybe that's fine for now, but we should fix it at some point?
```diff
@@ -532,6 +532,12 @@
     {{ runs | displayTestcaseResults(judgingDone) }}
 </td>
 <td>
+    {% if hasOutputVisualizer and judgingDone %}
```
Maybe add `and not visualization is defined`?
```python
vals = []
for line in lines:
    if 'READ' in line:
        vals.append(int(line.split(' ')[-1]))
```
This breaks if the submission writes bogus output, e.g. `READ domjudge`. How do we handle cases like this?
I would say this is something that is left up to the jury.
I wonder if there is something we can do with the output_validator and the judgemessage for communication: either an extra exit code, or otherwise the jury has to parse the output in this script.
Yes, it's left to the jury. In this case you are the jury by implementing the visualizer :-)
Since people are going to model their scripts after the examples we give them, we should make it robust so it doesn't crash.
Adding to this, I think we should return 42 in the script if we were able to visualize the output, and 43 if not.
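To make that concrete, here is a hedged sketch of how the judgedaemon side might interpret such exit codes. The 42/43 convention is only the suggestion above, not existing DOMjudge behavior, and `$visual_cmd` is a placeholder for the real command built elsewhere in this PR.

```php
<?php
// Sketch only, assuming the proposed 42/43 convention.
$visual_cmd = '/path/to/run input.txt visualization.png'; // placeholder command
system($visual_cmd, $retval);
if ($retval === 42) {
    // Visualization produced: upload the generated file to the API.
} elseif ($retval === 43) {
    // Visualizer declined gracefully (e.g. bogus team output): record that
    // there is no image for this run, but keep judging.
} else {
    // Anything else is a bug in the visualizer script: report an internal
    // error instead of crashing the judgedaemon.
}
```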
```diff
@@ -1,3 +1,4 @@
 name: Boolean switch search
 
 validation: custom interactive
+visualization: default
```
is that already spec'ed out?
No, this is not something which exists in the spec.
I was asking because I think this should be `custom` instead of `default`, since there is no default visualizer.
Any string will do at the moment; it would be nice to get @eldering's opinion, as I just picked something. Will change it to `custom` for now.
```diff
@@ -334,7 +334,10 @@ function fetch_executable_internal(
     $execrunpath = $execbuilddir . '/run';
     $execrunjurypath = $execbuilddir . '/runjury';
     if (!is_dir($execdir) || !file_exists($execdeploypath)) {
-        system('rm -rf ' . dj_escapeshellarg($execdir) . ' ' . dj_escapeshellarg($execbuilddir));
+        system('rm -rf ' . dj_escapeshellarg($execdir) . ' ' . dj_escapeshellarg($execbuilddir), $retval);
```
unrelated to this PR
```php
        continue;
    }

    $teamoutput = $workdir . "/testcase" . sprintf('%05d', $judgeTask['testcase_id']) . '/1/program.out';
```
Where does the `/1/` here come from?
Getting the output of the first run; in other words, this does not work for multipass problems.
Ah, right. Then please add a TODO and consider making the one a local variable, e.g. `$pass = 1;`, and then using that in the path construction.
What would we want to do in this case? I assume we actually want to visualize the last pass?
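Something along these lines, sketching the suggested local variable (the `/1/` path component comes from the diff above; treating the last pass as a TODO is an assumption about where this would go):

```php
// Sketch of the suggestion: make the pass explicit instead of hardcoding /1/.
// $pass = 1 keeps the current first-pass behavior.
$pass = 1; // TODO: multipass problems probably want the last pass here
$teamoutput = $workdir . '/testcase'
    . sprintf('%05d', $judgeTask['testcase_id'])
    . '/' . $pass . '/program.out';
```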
```php
    [$runpath, $teamoutput, $tmpfile]));
system($visual_cmd, $retval);
if ($retval !== 0) {
    error("Running '$runpath' failed.");
```
let's rather report an internal error like we do in other cases (and not crash judgedaemons)
```php
$programStrings = [];
$programStrings['package_dir'] = 'output_validators/';
$programStrings['type'] = 'output validator';
$programStrings['clash'] = 'cmp';
```
What does `clash` stand for?
Naming clash; this is mostly copied from how the output_validator is uploaded.

Do we limit upload filesize anywhere?
```php
        'Inconsistent data, no judging known with judgingid = ' . $judgeTask->getJobId() . '.');
}
if ($tempFilename = tempnam($this->dj->getDomjudgeTmpDir(), "visual-")) {
    $debug_package = base64_decode($request->request->get('visual_output'));
```
nit: update name (no longer `debug_package`)
```php
#[OA\Response(response: 200, description: 'When the visual output has been added')]
public function addVisualization(
    Request $request,
    #[OA\PathParameter(description: 'The hostname of the judgehost that wants to add the debug info')]
```
nit: this is not debug info
```php
    'comment' => 'Team output visualization.',
])]
#[ORM\Index(columns: ['judgingid'], name: 'judgingid')]
class Visualization
```
If we allow a visualizer to fail gracefully (see the exit code 43 suggestion above), we need to be able to record this here (and perhaps a message?).
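For example, a sketch of possible extra fields on the entity, following the attribute style of this class; the names, types, and comments are illustrative, not part of this PR:

```php
// Sketch only: record a graceful (exit code 43) failure and its message.
#[ORM\Column(options: ['comment' => 'Whether the visualizer produced output', 'default' => 1])]
private bool $valid = true;

#[ORM\Column(type: 'text', nullable: true, options: ['comment' => 'Optional message from the visualizer'])]
private ?string $message = null;
```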
One implication of the current implementation (which is reasonable) is that if you shut down a judgehost, you are not going to get the visualization. We should probably think about a good way to communicate this to the user.

Out of scope for this PR, but we could have a toggle at the problem level to visualize the output right after judging a test case (perhaps filtered down by verdict, e.g. only for wrong-answer).

That problem is also there for the debug package, I think. It wouldn't be that hard to extend this to optionally recalculate the result on the testcase, but if a submission is non-deterministic we would get a different visualization than what the original team output had. Your remark #2744 (comment) would prevent that problem, but would cause issues when visualization takes some time and there is a backlog.

I assume still with the priority of doing submission work first and only doing the visualization when there is nothing else, or do you want to extend the queued work to have a {Run+Visual}[x], with x the length of the worklist?
I tried to implement #367. This is still a bit rough, but I'd rather get some feedback first instead of continuing with this if it is not what @eldering envisioned.

The main problem I currently have is which testcases to visualize: do we only want the failing ones (and if so, only the `wrong-answer` ones), or do we just want to try all of them and accept that the script might fail, as the team output might not conform to the expected output? (For example, the team might have stopped halfway through the problem, or might still have debug output, etc.) The visualized team output for boolfind is of course not very helpful, but as this is only a demo, it is the best for now.

When implementing this I found out we could just always visualize as a task with lower priority and update the visualization task with the judgehost that should have the needed files. To keep this simple I wrongly assume that the file will always be on the judgehost; that should still be fixed if we want to continue with this.