Nemolite2d difference between PSyclone generated and manual Fortran implementation #29

Open
sergisiso opened this issue May 21, 2020 · 4 comments

@sergisiso
Collaborator

The manual Fortran implementation uses ssha%grid%subdomain%global%[nx|ny] to calculate the loop bounds of each kernel, while the PSyclone-generated PSy-layer uses ssha%grid%subdomain%internal%[xstop|ystop].

If I print them in a serial run of the 1024x1024 case, e.g.:

write(*,*) ssha%grid%subdomain%global%nx, ssha%grid%subdomain%internal%xstop
write(*,*) ssha%grid%subdomain%global%ny, ssha%grid%subdomain%internal%ystop

I get:

        1026        1025
        1026        1025

This implies that the manual version is doing an extra row and column of work. However, the final ua/uv checksums are not affected:

Implementation               ua checksum    uv checksum    time/step
psyclone_generated   gcc cpu 0.52670513E+02 0.57100321E+03 0.32245E+00
nemolite2d           gcc cpu 0.52670513E+02 0.57100321E+03 0.19181E+00

I believe PSyclone does the right thing, but before making the change, @arporter, can you confirm that this makes sense? Also, does subdomain%global% have the expected value, or could something be wrong in dl_esm_inf?

@LonelyCat124
Collaborator

LonelyCat124 commented May 21, 2020

Do you still get the correct results at later timesteps? I had some errors that only showed up at ~12k steps.

Also, how accurate is the checksum?

@arporter
Member

@sergisiso I think that's right. global%nx/ny should be the total number of points in each dimension, whereas internal%xstop is the upper bound for the internal/simulated points. I would expect the PSyclone version to be correct, whereas the manual version is more likely to have (accidentally) got out of sync.
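
To make the size of the discrepancy concrete, here is a minimal Python sketch (not code from either implementation) that counts the points visited by an inclusive loop nest under each choice of upper bound. The upper bounds are the values printed earlier in this issue. The shared lower bound `start = 2` and the helper `iteration_count` are assumptions for illustration only, since the actual lower bounds vary per kernel and are not given in this thread.

```python
def iteration_count(start, x_upper, y_upper):
    """Points visited by an inclusive (Fortran-style) 2D loop nest."""
    return (x_upper - start + 1) * (y_upper - start + 1)


# Upper bounds printed in the issue for the serial 1024x1024 run.
global_nx, global_ny = 1026, 1026
internal_xstop, internal_ystop = 1025, 1025

# Assumption: both versions share the same lower bound; 2 is a placeholder.
start = 2

manual_points = iteration_count(start, global_nx, global_ny)
generated_points = iteration_count(start, internal_xstop, internal_ystop)
extra_points = manual_points - generated_points

print(manual_points, generated_points, extra_points)
```

Under that assumed lower bound, the bound based on global%nx/ny visits one extra row and one extra column, which matches the observation in the opening comment.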

@arporter
Member

The checksum is, I guess, a fairly rough-and-ready guide, but in my experience it is sufficient to show when things are right and when they're not. 12K steps is a lot! I never regularly run for that long, so I can't comment. For a small domain (~100x100) you've pretty much exercised all the physics by about 6K steps (i.e. incoming waves have reached the end and been reflected back).
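
One way to put a number on "rough and ready" is to look at how many digits the printed checksums actually carry. The Python sketch below uses the checksum strings from the table in the opening comment and estimates the smallest difference that the printed format could reveal. The helper `printed_resolution` is hypothetical, and how the checksum itself is computed is not stated in this thread, so this only bounds what the printed output can show.

```python
import math


def printed_resolution(text):
    """Size of one unit in the last printed digit of a value like 0.5E+02."""
    mantissa, exponent = text.upper().split("E")
    digits = len(mantissa.split(".")[1])
    return 10.0 ** (int(exponent) - digits)


# Checksum strings copied from the table in the opening comment.
ua_generated, ua_manual = "0.52670513E+02", "0.52670513E+02"
uv_generated, uv_manual = "0.57100321E+03", "0.57100321E+03"

for name, a, b in [("ua", ua_generated, ua_manual),
                   ("uv", uv_generated, uv_manual)]:
    same = math.isclose(float(a), float(b), rel_tol=0.0, abs_tol=0.0)
    print(name, "match:", same, "printed resolution:", printed_resolution(a))
```

Matching printed values therefore only show agreement to about eight significant digits of a single scalar, so smaller or spatially localised differences could still be hidden.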

@sergisiso sergisiso self-assigned this May 29, 2020
@arporter
Member

arporter commented Jan 6, 2022

@sergisiso can we close this one now?
