
Create unified tests #1341

Merged
merged 174 commits into flintlib:trunk on Nov 1, 2023

Conversation

albinahlback
Collaborator

@albinahlback albinahlback commented Apr 4, 2023

Note: New description

Code changes

Currently, all tests in FLINT are basically written in the same way, namely

/* Copyright notice for module/test/t-function.c */

#include "myheaders.h"

int main(void)
{
    mytype_t some_variables;
    FLINT_TEST_INIT(state);

    printf("module_function....");
    fflush(stdout);

    /* PERFORMING TESTS */

    FLINT_TEST_CLEANUP(state);
    printf("PASS\n");
    return 0;
}

To streamline this, we can do this instead:

/* Copyright notice for module/test/t-function.c */

#include "test_helpers.h"
#include "myheaders.h"

TEST_FUNCTION_START(module_function, state)
{
    mytype_t some_variables;

    /* PERFORMING TESTS */

    TEST_FUNCTION_END(state);
}

with a main function

/* Copyright notice for module/test/main.c */

/* Include system headers */

/* Include functions *********************************************************/

...
#include "t-function.c"
...

/* Array of test functions ***************************************************/

test_struct tests[] =
{
    ...
    TEST_FUNCTION(module_function),
    ...
};

/* main function *************************************************************/

TEST_MAIN(tests)

End result

The end result will be more streamlined tests along with a reduced compile time, as only module/test/main.c has to be compiled, in contrast to compiling each module/test/t-*.c separately. I suspect this will reduce the compile time for the tests by at least 50%, and possibly considerably more.

Also, we should see a great reduction in the number of lines.

Functionality

In order to run only a subset of test functions, one can pass their names as arguments, e.g. ./build/fmpz/test/main fmpz_add fmpz_addmul, which runs only the tests for fmpz_add and fmpz_addmul.

I do not intend to implement multithreaded testing for a single module in this PR, but it would be nice to have this. However, as of now it is not the main priority.

Bug fixes-ish

As I am going through all files (semi-automatically, using Vim macros to change everything), I noticed that some test functions print the name of a different function than the one they actually test. All of these are being fixed.

Moreover, some test functions for IO create forks inside the test functions. As the child processes were never terminated, this led to multiple processes running the same tests after these IO tests. Hence, I ensured that these child processes exit, and that the parent waits until the child process has exited.

@fredrik-johansson
Collaborator

I think you can remove a lot of boilerplate. The macro

TEST_FUNCTION(acb_acos)

could create a struct entry containing both the function pointer and the string, i.e.

{ test_acb_acos, "acb_acos" }

So you wouldn't need the separate array with

char acb_acos_name[] = "acb_acos";
...

char * test_names[] =
...

@fredrik-johansson
Collaborator

Parallel testing for a single module is quite useful to keep. What are the options?

  • Build a test "program" for each function that is just a shell script invoking the main test program with its string.
  • When the main test program is called without an argument, have it spawn subprocesses instead of calling test functions in a loop.
  • Hack the makefile to accomplish this some other way...?

@albinahlback
Collaborator Author

I think you can remove a lot of boilerplate. The macro

TEST_FUNCTION(acb_acos)

could create a struct entry containing both the function pointer and the string, i.e.

{ test_acb_acos, "acb_acos" }

So you wouldn't need the separate array with

char acb_acos_name[] = "acb_acos";
...

char * test_names[] =
...

Thanks for your feedback! Yes, this sounds like the best alternative; I will do this instead.

@albinahlback
Collaborator Author

albinahlback commented Oct 16, 2023

Parallel testing for a single module is quite useful to keep. What are the options?

* Build a test "program" for each function that is just a shell script invoking the main test
  program with its string.

* When the main test program is called without an argument, have it spawn subprocesses instead of calling test functions in a loop.

* Hack the makefile to accomplish this some other way...?

Just so it is out there, I think we can agree that parallel testing within each module, when running all tests throughout FLINT, doesn't really matter and would probably not be beneficial at all.

However, for one module only, I am thinking that a call like

make check MOD=fmpz [THREAD=1/NTHREADS=8]

could invoke threading via

./build/fmpz/test/main [THREAD=1/NTHREADS=8]

which would spawn threads, each of which works on a subset of the tests. It wouldn't be hard to implement. Does that sound good?

Edit: I will first check if it is possible to obtain the number of threads in Make, so that one wouldn't need to pass another parameter to Make but could simply use -j.

@fredrik-johansson
Collaborator

fredrik-johansson commented Oct 16, 2023

We'd have to be careful about running multiple tests in the same process because this changes runtime behaviour: we will enter some tests with cached data (fmpzs, primes, arb constants, thread pool, etc.) whereas tests previously always started with a clean slate. Actually this can be a good thing to test too, but I don't know if it should be the default.

There is a particular problem with using multithreading to run tests in parallel: tests intended specifically to check multithreaded algorithms may end up only running serially because the thread pool is occupied by the competing test programs! (Again, it is also potentially interesting to deliberately test such conditions, but it should maybe not be the default.)

@albinahlback
Collaborator Author

Okay, I would prefer not to parallelize the tests in this PR. Is that okay with you, @fredrik-johansson? We can do that later; it should be easy.

@albinahlback albinahlback changed the title Sketch unified tests Create unified tests Oct 17, 2023
@fredrik-johansson
Collaborator

It looks fine, but I would like to see parallel tests working from the start.

@albinahlback
Collaborator Author

I think I have an idea of how to check a single module with multiple jobs. Let $N$ denote the number of jobs.

  1. In test_helpers.h, we modify TEST_MAIN so that it prints all available tests when given some option, say --print-tests.
  2. In the Makefile, if $N > 1$ and MOD is set, we run a recipe that sets a variable, say MOD_TESTS, to all tests for module MOD.
  3. In the Makefile, create a partition of MOD_TESTS into $N$ variables MOD_TESTS_1, ..., MOD_TESTS_$(N).
  4. In the Makefile, we run $N$ recipes that check module MOD, one per part of the partition.

The running recipes should look something like

ifdef MOD
define mod_recipe
$(BUILD_DIR)/$(MOD)/test/main_RUN_$(1): $(BUILD_DIR)/$(MOD)/test/main | $(BUILD_DIR)/$(MOD)/test
	@$< $(MOD_TESTS_$(1))
endef

$(foreach job_number, $(JOB_NUMBER_LIST), $(eval $(call mod_recipe,$(job_number))))
endif

@albinahlback
Collaborator Author

@fredrik-johansson Is there a reason why many files contain an empty line at the end of the file?

@albinahlback albinahlback force-pushed the unifytests branch 2 times, most recently from fff800a to e5a69ed on October 18, 2023 at 12:17
@edgarcosta
Member

For running the tests in parallel, I would outsource it to GNU Parallel.

@albinahlback
Collaborator Author

For running the tests in parallel, I would outsource it to GNU Parallel.

I'd prefer if the makefile contains as much code as possible written in native Make, but if I cannot make it work I will consider this. Thanks!

@edgarcosta
Member

In hindsight, I think you are right, native make sounds much more reasonable.

@albinahlback
Collaborator Author

But I really want the runtime to be lower before merging, since a high runtime could signal that something went wrong. For example, in the MSVC runner, nmod_mat seems to time out, which is weird.

@fredrik-johansson
Collaborator

OK, parallel make -j check MOD=fmpz seems to work with make 4.4.

I consider that acceptable as long as make -j check still works with older versions.

@albinahlback
Collaborator Author

OK, parallel make -j check MOD=fmpz seems to work with make 4.4.

I consider that acceptable as long as make -j check still works with older versions.

Looks like a lot of work was done on MAKEFLAGS in Make 4.4, so I suppose this will only work after 4.3.

@albinahlback
Collaborator Author

Okay, on Linux with Make 4.4.1, I have performed some tests, and the tests on this branch are indeed faster when not running in parallel. However, for parallel runs, that is not always the case. The solution seems to be to run make -j$(nproc) check instead of make -j check.

One could alter the flags inside the Makefile, but I believe that requires Make 4.4.

Will post some benchmarks soon.

@albinahlback
Collaborator Author

Here are some benchmarks indicating that this branch is faster when running make -j`expr $(nproc) + 1` check compared to any variant of make -j check on the trunk branch. I have a 4-threaded CPU, so the fastest for me is make -j5 check.

unifytests, make check:
real    9m30.998s
user    9m31.318s
sys     0m5.060s

unifytests, make -j2 check:
real    4m59.836s
user    9m51.389s
sys     0m5.527s

unifytests, make -j3 check:
real    4m36.397s
user    13m35.338s
sys     0m6.450s

unifytests, make -j4 check:
real    4m23.140s
user    16m16.404s
sys     0m7.509s

unifytests, make -j5 check:
real    4m20.459s
user    16m19.828s
sys     0m7.339s

unifytests, make -j6 check:
real    4m20.733s
user    16m16.791s
sys     0m6.972s

unifytests, make -j check:
real    4m48.183s
user    16m11.060s
sys     0m8.275s

trunk, make -j5 check:
real    4m39.209s
user    17m4.374s
sys     0m21.604s

trunk, make -j check:
real    4m42.549s
user    17m9.831s
sys     0m19.387s

@fredrik-johansson
Collaborator

I get the same behavior:

time make -j9 check
...
real        1m3,334s
user        8m29,395s
sys         0m16,298s

Curious; any clue why it's like that?

@albinahlback
Collaborator Author

I get the same behavior:

time make -j9 check
...
real        1m3,334s
user        8m29,395s
sys         0m16,298s

Curious; any clue why it's like that?

Nice that you got the same result! I remember seeing something about it on SE, but I don't know the reason behind it.

@albinahlback
Collaborator Author

Finally. I think it is ready to be merged!

@fredrik-johansson
Collaborator

It looks good. Before merging, I just want to see that test coverage hasn't changed.

Could you update the explanations https://flintlib.org/doc/building.html also? Document the make version requirements, how to set -jN correctly, how to test a single module, how to test a single function etc.

@fredrik-johansson
Collaborator

The coverage changes reported by codecov are worth a look. For example, there seem to be functions in nmod.h and fmpz_mpoly_factor.h among others that lost coverage. This could just be some issue with the way codecov processes header files though.

@albinahlback
Collaborator Author

Could you update the explanations https://flintlib.org/doc/building.html also? Document the make version requirements, how to set -jN correctly, how to test a single module, how to test a single function etc.

I pushed a commit for this.

@albinahlback
Collaborator Author

The coverage changes reported by codecov are worth a look. For example, there seem to be functions in nmod.h and fmpz_mpoly_factor.h among others that lost coverage. This could just be some issue with the way codecov processes header files though.

Yeah, I really want to fix these things, but looking into it, it seems like there is no actual change. For example, fmpz_mpoly_factor_get_exp_si does not appear once in the trunk (apart from the header file), nor once in this branch. However, codecov now seems to think that it became uncovered...

@albinahlback
Collaborator Author

Like, there may be a lot of bugs and so on in this PR, but I doubt that this PR will actually decrease the coverage.

@fredrik-johansson
Collaborator

fredrik-johansson commented Nov 1, 2023

The local coverage report seems to be broken. Doing

./configure --enable-coverage --disable-static --enable-avx2
make -j
make -j check
make coverage

I get coverage reports for all the test main programs, not the actual library source code.

@albinahlback
Collaborator Author

The local coverage report seems to be broken. Doing

./configure --enable-coverage --disable-static --enable-avx2
make -j
make -j check
make coverage

I get coverage reports for all the test main programs, not the actual library source code.

Does my commit work?

@fredrik-johansson
Collaborator

That excludes the tests, but all .c files in the library are being ignored for some reason (I only get coverage reports for the header files).

@fredrik-johansson
Collaborator

OK, that may just have been something wonky on my end. It works after I changed some search paths.

@fredrik-johansson
Collaborator

OK, will merge this now. Nothing looks obviously wrong, and the CI speedup alone is worth it.

Thanks a lot!

@fredrik-johansson fredrik-johansson merged commit aa91536 into flintlib:trunk Nov 1, 2023
16 checks passed
@albinahlback albinahlback deleted the unifytests branch November 2, 2023 06:32