Changes for https://github.com/antlr/grammars-v4/issues/2988 #3024

kaby76 · 2023-01-23T11:38:59Z

Statement of Problem

This PR addresses #2988.

Tests in the V4 repo parse one input file per program invocation, which I call individual parsing. But, we can save a good deal of time by parsing multiple input files per program invocation, called grouped parsing. Let's see what this looks like for testing the parse of input files AllInOne8.java, helloworld.java, and IdentifierTest.java for the Java9 grammar.

For individual parsing, one test is processed by one program call:

./bin/Debug/net6.0/Test.exe ../examples/AllInOne8.java
Time: 00:00:05.9106526
Parse succeeded.
./bin/Debug/net6.0/Test.exe ../examples/helloworld.java
Time: 00:00:00.5270543
Parse succeeded.
./bin/Debug/net6.0/Test.exe ../examples/IdentifierTest.java
Time: 00:00:00.6478677
Parse succeeded.

For grouped parsing, all tests are processed by one program call:

./bin/Debug/net6.0/Test.exe ../examples/AllInOne8.java ../examples/helloworld.java ../examples/IdentifierTest.java ...
CSharp 0 ../examples/AllInOne8.java success 6.1742676
CSharp 1 ../examples/helloworld.java success 0.0255556
CSharp 2 ../examples/IdentifierTest.java success 0.1810686

With grouped parsing, the time for parsing each successive input should be shorter relative to the time the input took with individual parsing. Internally, Antlr keeps a cache of prediction context between each parse, which grouped parsing exploits and results in a faster parse. Grouped parsing should help for Java, CSharp, and a few other targets. But, it may not help with all targets, especially PHP, which shows no speed-up, and is likely a bug. See antlr/antlr-php-runtime#36.

Requirements, Design, and Implementation

To implement grouped parsing, the newest version of Trash trgen will need to be employed, as well as the templates in this repo updated. This PR changes the CI build tests, and some of the .errors and .tree files (more on this explained below).

After a detailed reading and testing of the current _scripts/test.ps1 and templates, I came to the conclusion that there was a significant disconnect in the requirements, design, and implementation between the Bash and Powershell test scripts. Therefore, we need to state some requirements for these scripts and the generated scripts for trgen.

Requirements

The basic requirement for the Bash or Powershell test scripts is to provide a simple, self-contained method to build and test the code in a Bash or Powershell environment.

cd abb
trgen --template-sources-directory ../_scripts/templates
# Run Bash script to build then test.
bash build.sh
bash test.sh
# Or, run Powershell script to build then test.
pwsh build.ps1
pwsh test.ps1

The scripts should take minimal parameters so that people can focus on updating changes to a grammar rather than wasting time figuring out how to test the grammar across eight targets. (Note, to be implemented for Build is very slow, tests too much, outputs too much #2883.)
trgen will be used to generate scripts test.sh and test.ps1. The Bash and Powershell scripts should mirror each other step-by-step. The scripts should also mirror each other across targets, e.g., Java and Dart.
There should be scripts for building, testing, and cleaning up, both Bash and Powershell.
The top-level scripts regtest.sh and test.ps1 (replacing tester.ps1) should be incremental (taking into account git status), and runable from any directory (including a grammar, or a collection of grammars, like asm/, antlr, or sql/). (NB: This requirement will be addressed in Build is very slow, tests too much, outputs too much #2883.)

git clone https://github.com/antlr/grammars-v4.git
cd grammars-v4
# Run Bash tester from repo root.
bash _scripts/regtest.sh
# Run Powershell tester from repo root.
pwsh _scripts/test.ps1
# Run Bash tester on specific grammar.
cd abb
bash ../_scripts/regtest.sh
# Run Powershell tester on specific grammar (TO BE COMPLETED LATER).
pwsh ../_scripts/test.ps1

With grouped parsing, the parser driver program must generate .errors and .tree files. The calling script cannot capture the output through redirection as done previously because output for all inputs would then be grouped together. (The testing scripts must be compatible with the Antlr Maven test program.) NB: Careful attention is placed on capturing all parser output. Any output to stderr that is not for trees and standard parsing errors is considered an error in the execution of the parser. For example, this can happen especially with dynamically typed and parsed languages like JavaScript, Python3, and PHP in which the base class is not correct, and the interpreter throws an error.
git will be used to implement diffing of the .errors and .tree files. The file .gitignore must be updated to not track these files, unless the .errors or .tree file is explicitly tracked for regression testing. The .errors files will have a trailing newline after each syntax error message. The .tree files will not have a trailing newline after the one tree printed.

Discussion of requirements

It's hard to remember how to run the compiler and program of the parser program for a particular target. It's even worse trying to remember the syntax and runtimes for the targets. That's the whole point of trgen.

Unfortunately, tester.psm1, which is a templated file thus generated by trgen, is not a stand-alone Powershell script because the extension is ".psm1", not ".ps1". So, it is unusable by someone working on a grammar and just wanting to test the grammar across targets.

Further, the parent script "test.ps1" assumes the test input is always in a directory called "examples/". trgen reads the pom.xml to get the location of the input tests.

Since grouped parsing can't use Bash/Powershell capture of the parse tree--because there would be multiple trees captured for multiple input files--the driver program must place the output in a specific output file. But, then, how does the script check diffs of these parse tree files??

In order to implement parse tree file diffs, I use git diff to find those files that are changed. It's laborious to write code to do diffs in Bash and Powershell. In fact, the Powershell "diff" strips out newlines of a multiline output file contained in a string. Let's make the code simpler by just using "git". If there's no git, or the grammar has been removed from the rest of the cloned repo, then the tests should just default to returning any errors found, even if they are expexted.

The .errors files seem to have extra newlines at the end. Extremely annoying. I don't know if they are hand-edited or the drivers that created them include extra printf's for the hell of it. In any case, I remastered a number of these files with this PR in order for the tests to pass.

From this point forward, test.sh or test.ps1 should be used to remaster all .tree and .errors files, as pondered in this comment.

Results

Although not scientific, I compared one testing of the V4 grammars with individual parsing vs grouped parsing. On an unloaded Windows 11 machine (Ryzen 7 2700, 8-cores, DDR4 16GB, SanDisk SDSSDH3 1TB, Antlr4.11.1, NET SDK 7.0.101), individual parsing of the CSharp target completed in 1h 44m, grouped parsing completed in 1h 11m. That is significantly faster.

Conclusions

This is an important PR. Grouped parsing helps a lot in the parse time for the build. But, overall, the main bottleneck in the build is the compilation of the generated code. It really goes back to #2883. The "incremental" testing isn't really working because, somehow, the information on what specifically to test is ignored. In addition, we shouldn't be testing combined grammars because they don't have target-specific action code.

…stainable. All .errors and .trees that are tracked in Git repo must be generated exactly from the program consistently. Errors in the .errors files must have exactly one newline terminating the error. The parse tree in the .tree file does not have a newline terminating the tree.

…gnore causes any diffs in the tracked to be ignored as well, completely defeating the whole purpose. Makes absolutely zero sense writers of Git because .gitignore is a "broad brush" setting, and tracking a specific file ***should be*** a fine brush overriding the broad brush.

…s file changed last, cause massive failure in github actions. Push something to test again.

kaby76 · 2023-01-28T10:43:12Z

Massive failure in Github Actions with 4-line change to .gitignore file. No information from Github Actions. No output, no data, no reasons. This system is garbage really written by bozos.

kaby76 · 2023-01-28T10:57:03Z

@teverett @KvanTTT

Folks, There is something quite wrong with Github Actions. The last two actions for this PR failed with zero output. Can someone look into why it is failing?

kaby76 · 2023-01-28T12:04:34Z

I finally see after control-minus-sign to zoom out of the page in the browser: "The job was not started because recent account payments have failed or your spending limit needs to be increased. Please check the 'Billing & plans' section in your settings."

I need to get this PR, #3024, and the next one for #2883 in as soon as possible. The build is not sustainable because it chews up way, way too much computing time. The builds must be change to an increment build. This PR, #3024, changes the tests to use less compute time in parsing. The next PR will change to build only the grammars that have changed.

teverett · 2023-01-28T16:16:09Z

@teverett @KvanTTT

Folks, There is something quite wrong with Github Actions. The last two actions for this PR failed with zero output. Can someone look into why it is failing?

teverett · 2023-01-28T16:18:33Z

@teverett @KvanTTT
Folks, There is something quite wrong with Github Actions. The last two actions for this PR failed with zero output. Can someone look into why it is failing?

I presume we reached the compute-time limit for open source projects. @parrt can you confirm?

I suggest we wait a couple days since its Jan 28. It'll reset, and we'll pull PR #3024. @kaby76 @KvanTTT what do you think?

kaby76 · 2023-01-28T16:22:16Z

@teverett That sounds fine.

teverett · 2023-01-28T16:26:24Z

@kaby76 does PR #3024 mean we retire antlr4test-maven-plugin?

kaby76 · 2023-01-28T16:54:54Z

@teverett Up to you. Whatever you think. --Ken

teverett · 2023-01-30T19:24:18Z

Does the maven plugin add any value that this PR doesn't? If not, could you update the PR to remove the maven plugin from the build?

kaby76 · 2023-01-30T22:29:11Z

Does the maven plugin add any value that this PR doesn't? If not, could you update the PR to remove the maven plugin from the build?

Hi Tom, I'm going to turn off the Maven build because it isn't git diff aware (so it can test only grammars that change) and focus on the other two. At some point, we can probably drop one of those, too, since the Powershell and Bash tester codes are almost identical line for line with the PR for #2883.

teverett · 2023-02-04T15:25:22Z

@kaby76 thanks!

Changes for #2988

b598ccf

kaby76 marked this pull request as draft January 23, 2023 13:48

kaby76 marked this pull request as ready for review January 23, 2023 17:10

teverett added the infrastructure label Jan 24, 2023

kaby76 added 3 commits January 24, 2023 11:47

Renamed interfering Test.java sources--trgen much better tester.

a570cab

Build failed with absolutely no output, no data on why it failed. Thi…

494d45f

…s file changed last, cause massive failure in github actions. Push something to test again.

kaby76 marked this pull request as draft January 28, 2023 10:41

teverett closed this Jan 28, 2023

teverett reopened this Jan 28, 2023

kaby76 marked this pull request as ready for review January 28, 2023 20:26

Remove mvn builds.

8715d6d

teverett merged commit 20f5c83 into antlr:master Feb 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Changes for https://github.com/antlr/grammars-v4/issues/2988 #3024

Changes for https://github.com/antlr/grammars-v4/issues/2988 #3024

kaby76 commented Jan 23, 2023 •

edited

Loading

kaby76 commented Jan 28, 2023

kaby76 commented Jan 28, 2023 •

edited

Loading

kaby76 commented Jan 28, 2023 •

edited

Loading

teverett commented Jan 28, 2023

teverett commented Jan 28, 2023

kaby76 commented Jan 28, 2023

teverett commented Jan 28, 2023

kaby76 commented Jan 28, 2023

teverett commented Jan 30, 2023

kaby76 commented Jan 30, 2023

teverett commented Feb 4, 2023

Changes for https://github.com/antlr/grammars-v4/issues/2988 #3024

Changes for https://github.com/antlr/grammars-v4/issues/2988 #3024

Conversation

kaby76 commented Jan 23, 2023 • edited Loading

Statement of Problem

Requirements, Design, and Implementation

Requirements

Discussion of requirements

Results

Conclusions

kaby76 commented Jan 28, 2023

kaby76 commented Jan 28, 2023 • edited Loading

kaby76 commented Jan 28, 2023 • edited Loading

teverett commented Jan 28, 2023

teverett commented Jan 28, 2023

kaby76 commented Jan 28, 2023

teverett commented Jan 28, 2023

kaby76 commented Jan 28, 2023

teverett commented Jan 30, 2023

kaby76 commented Jan 30, 2023

teverett commented Feb 4, 2023

kaby76 commented Jan 23, 2023 •

edited

Loading

kaby76 commented Jan 28, 2023 •

edited

Loading

kaby76 commented Jan 28, 2023 •

edited

Loading