
Output vizualizer #2744

Open · wants to merge 54 commits into main
Conversation

@vmcj (Member) commented Oct 14, 2024

I tried to implement #367. This is still a bit rough, but I would rather get some feedback first instead of continuing with this if it is not what @eldering envisioned.

The main problem I currently have is deciding which testcases to visualize: do we only want the failing one (and if so, only the wrong-answer one), or do we try all of them and accept that the script might fail because the team output might not conform to the expected output? (For example, the team might have stopped halfway through the problem or might still have debug output, etc.)

The visualized team output for boolfind is of course not very helpful, but as this is only a demo, this is the best we have for now.

While implementing this I found out that we could always visualize as a lower-priority task and update the visualization task with the judgehost that should have the needed files. To keep this simple I wrongly assume that the file will always be present on the judgehost; that should still be fixed if we want to continue with this.

vmcj and others added 30 commits October 13, 2024 14:21
array_filter would only filter out the other judgehosts but still return an array of judgehost objects. By selecting the first (and only) item we can now get the `enabled` property and properly check it.

```
array(1) {
  [1]=>
  array(5) {
    ["id"]=>
    string(1) "2"
    ["hostname"]=>
    string(8) "judgehost"
    ["enabled"]=>
    bool(true)
    ["polltime"]=>
    string(20) "1728821560.017400000"
    ["hidden"]=>
    bool(false)
  }
}
```
- Such a script will most likely never be a generic script.
- This is untested as yet, since the syntax should be discussed further.
- This assumes a similar invocation as for the output_validator; the alternative is to add this to the domjudge-problem.ini file if it doesn't end up in the ICPC problem spec.
- Getting this directly via submission.problem.output_visualizer_executable (the property) seems to fail: it does show up in the Twig dump, but the translation fails when using it.
- This is needed during testing of the job on the judgehost.
- Debugging via the API is hard, so we hardcode the search and copy it over in a later commit.
```diff
@@ -768,7 +771,7 @@ function fetch_executable_internal(
     $judgehosts = request('judgehosts', 'GET');
     if ($judgehosts !== null) {
         $judgehosts = dj_json_decode($judgehosts);
-        $judgehost = array_filter($judgehosts, fn($j) => $j['hostname'] === $myhost);
+        $judgehost = array_values(array_filter($judgehosts, fn($j) => $j['hostname'] === $myhost))[0];
```
Member: This is part of your other PR
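For illustration, a sketch of the check this change enables; the surrounding code is hypothetical, only the `array_values(...)[0]` pattern comes from the diff above:

```php
// With array_values(...)[0] we hold a single judgehost record rather than
// a filtered array, so its fields can be read directly.
$matches = array_values(array_filter($judgehosts, fn($j) => $j['hostname'] === $myhost));
$judgehost = $matches[0] ?? null;
if ($judgehost !== null && !$judgehost['enabled']) {
    // Hypothetical handling: this judgehost was disabled via the API.
}
```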

```php
        $run_config['hash']
    );
    if (isset($error)) {
        // FIXME
```
Member: Fix me 😛

Member Author: Loosely based on what we do for the debug tasks, I'll see if I can merge the shared parts and possibly fix this one.

```php
    $debug_package = base64_decode($request->request->get('visual_output'));
    file_put_contents($tempFilename, $debug_package);
}
// FIXME: error checking
```
Member: Another fixme
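For reference, a sketch of what that error checking could look like; the exception type is borrowed from elsewhere in this controller, and this is not the author's implementation:

```php
// Hedged sketch of the missing error checking around the snippet above.
$visual_output = $request->request->get('visual_output');
if ($visual_output === null) {
    throw new BadRequestHttpException("Argument 'visual_output' is mandatory");
}
$decoded = base64_decode($visual_output, true); // strict mode rejects malformed input
if ($decoded === false) {
    throw new BadRequestHttpException("Argument 'visual_output' is not valid base64");
}
if (file_put_contents($tempFilename, $decoded) === false) {
    throw new BadRequestHttpException('Could not store the visualization on disk');
}
```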

```diff
@@ -1478,6 +1533,7 @@ public function getJudgeTasksAction(Request $request): array
         throw new BadRequestHttpException('Argument \'hostname\' is mandatory');
     }
     $hostname = $request->request->get('hostname');
+    $hostname = 'Computer';
```
Member: Whoops

```php
    throw new BadRequestHttpException(
        'Inconsistent data, no judging known with judgingid = ' . $judgeTask->getJobId() . '.');
}
if ($tempFilename = tempnam($this->dj->getDomjudgeTmpDir(), "visual-")) {
```
Member: This means it doesn't work with 2 servers in HA mode. We should store this in the DB as a file.

Member Author: This is loosely based on what we do with debug packages, so we have the same problem there.

Member: Ah, I didn't know. Then maybe that's fine for now, but we should fix it at some point?

```diff
@@ -532,6 +532,12 @@
     {{ runs | displayTestcaseResults(judgingDone) }}
 </td>
 <td>
+    {% if hasOutputVisualizer and judgingDone %}
```
Member: Maybe add `and not visualization is defined`?

@meisterT (Member)
Getting this error when clicking the "visualize" button:
[screenshot]

```python
vals = []
for line in lines:
    if 'READ' in line:
        vals.append(int(line.split(' ')[-1]))
```
Member: This breaks if the submission writes bogus output, e.g. `READ domjudge`. How do we handle cases like this?

Member Author:
I would say this is something which is left up to the jury.

I wonder if there is something we can do with the output_validator and the judgemessage for communication: either an extra exit code, or otherwise the jury has to parse the output in this script.

Member:
Yes, it's left to the jury. In this case you are the jury by implementing the visualizer :-)

Since people are going to model after the examples we give them, we should make it robust and not crash.

Member: Adding to this, I think we should return 42 in the script if we were able to visualize the output, and 43 if not.
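To make that convention concrete, here is a sketch of a visualizer that fails gracefully. The demo script in this PR is Python; PHP is used here only for consistency with the other examples on this page (the visualizer is an arbitrary executable), and the argument layout (team output in, image path out) is an assumption:

```php
#!/usr/bin/env php
<?php
// Sketch: exit 42 when a visualization was produced, 43 when the team
// output could not be visualized (per the suggestion above).
$teamOutputFile = $argv[1] ?? null; // assumed invocation: run <team output> <image out>
$imageOutFile   = $argv[2] ?? null;
if ($teamOutputFile === null || ($lines = @file($teamOutputFile)) === false) {
    exit(43);
}
$vals = [];
foreach ($lines as $line) {
    if (str_contains($line, 'READ')) {
        $parts = preg_split('/\s+/', trim($line));
        $last  = end($parts);
        if (!is_numeric($last)) {
            exit(43); // bogus team output such as "READ domjudge": give up gracefully
        }
        $vals[] = (int)$last;
    }
}
if ($vals === []) {
    exit(43); // nothing to draw, e.g. the team stopped halfway through
}
// ... render $vals into $imageOutFile here ...
exit(42);
```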

```diff
@@ -1,3 +1,4 @@
 name: Boolean switch search

 validation: custom interactive
+visualization: default
```
Member: is that already spec'ed out?

Member Author: No, this is not something which exists in the spec.

@meisterT (Member) commented Oct 22, 2024
I was asking because I think this should be `custom` instead of `default`, since there is no default visualizer.

Member Author: Any string will do at the moment; it would be nice to get @eldering's opinion, as I just picked something. Will change it to custom for now.

```diff
@@ -334,7 +334,10 @@ function fetch_executable_internal(
     $execrunpath = $execbuilddir . '/run';
     $execrunjurypath = $execbuilddir . '/runjury';
     if (!is_dir($execdir) || !file_exists($execdeploypath)) {
-        system('rm -rf ' . dj_escapeshellarg($execdir) . ' ' . dj_escapeshellarg($execbuilddir));
+        system('rm -rf ' . dj_escapeshellarg($execdir) . ' ' . dj_escapeshellarg($execbuilddir), $retval);
```
Member: unrelated to this PR

```php
        continue;
    }

    $teamoutput = $workdir . "/testcase" . sprintf('%05d', $judgeTask['testcase_id']) . '/1/program.out';
```
Member: where does the /1/ here come from?

Member Author: This gets the output of the first run; in other words, it does not work for multipass problems.

Member: Ah, right. Then please add a TODO and consider making the `1` a local variable, e.g. `$pass = 1;`, and then using that in the path construction.

Member Author: What would we want to do in this case? I assume we actually want to visualize the last pass?
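The suggested refactor would look something like this (a sketch; which pass to pick for multipass problems is still the open question above):

```php
// TODO: for multipass problems this should probably select the last pass.
$pass = 1;
$teamoutput = $workdir . '/testcase' . sprintf('%05d', $judgeTask['testcase_id']) . "/$pass/program.out";
```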

```php
    [$runpath, $teamoutput, $tmpfile]));
system($visual_cmd, $retval);
if ($retval !== 0) {
    error("Running '$runpath' failed.");
```
Member: let's rather report an internal error like we do in other cases (and not crash judgedaemons)
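A sketch of that suggestion; `logmsg()` is assumed to be the daemon's usual logging helper, and how exactly an internal error is filed here should follow the existing cases, which this sketch only marks with a comment:

```php
if ($retval !== 0) {
    // Don't call error(): that terminates the judgedaemon. Log instead and
    // file an internal error via the API like the other failure paths do.
    logmsg(LOG_ERR, "Running '$runpath' failed with exit code $retval.");
    // ... report internal error and skip this visualization task ...
}
```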

```php
$programStrings = [];
$programStrings['package_dir'] = 'output_validators/';
$programStrings['type'] = 'output validator';
$programStrings['clash'] = 'cmp';
```
Member: what does clash stand for?

Member Author: A naming clash; this is mostly copied from how the output_validator is uploaded.

@meisterT (Member)
Do we limit upload filesize anywhere?

```php
        'Inconsistent data, no judging known with judgingid = ' . $judgeTask->getJobId() . '.');
}
if ($tempFilename = tempnam($this->dj->getDomjudgeTmpDir(), "visual-")) {
    $debug_package = base64_decode($request->request->get('visual_output'));
```
Member: nit: update name (no longer debug_package)

```php
#[OA\Response(response: 200, description: 'When the visual output has been added')]
public function addVisualization(
    Request $request,
    #[OA\PathParameter(description: 'The hostname of the judgehost that wants to add the debug info')]
```
Member: nit: this is not debug info

```php
    'comment' => 'Team output visualization.',
])]
#[ORM\Index(columns: ['judgingid'], name: 'judgingid')]
class Visualization
```
Member: If we allow a visualizer to fail gracefully (see the exit code 43 message above), we need to be able to record this here (and perhaps a message?).

@meisterT (Member)
One implication of the current implementation (which is reasonable) is that if you shut down a judgehost, you are not going to get the visualization. We should probably think about a good way to message this to the user.

@meisterT (Member)
Out of scope for this PR: we could have a toggle at the problem level to visualize the output right after judging a test case (perhaps filtered down by verdict, e.g. only for wrong-answer).

@vmcj (Member Author) commented Oct 22, 2024

> One implication of the current implementation (which is reasonable) is that if you shut down a judgehost, you are not going to get the visualization. We should probably think about a good way to message this to the user.

That problem is also there for the debug package, I think. It wouldn't be that hard to extend this to optionally calculate the result on the testcase, but if a submission is non-deterministic we would get a different visualization from what the original team output had. Your remark #2744 (comment) would prevent the problem, but would cause an issue when visualization takes some time and there is a backlog.

@vmcj (Member Author) commented Oct 22, 2024

> Out of scope for this PR: we could have a toggle at the problem level to visualize the output right after judging a test case (perhaps filtered down by verdict, e.g. only for wrong-answer).

I assume still with the priority of doing the submission work first and only doing the visualization when there is nothing else? Or do you want to extend the queued work to have a {Run+Visual}[x], with x as the length of the worklist?
