clubb_intr GPUization #1175

Open: wants to merge 12 commits into base: cam_development

huebleruwm

This only modifies clubb_intr.F90 and doesn't require a new version of clubb. The purpose is to add OpenACC directives so that computations can be offloaded to GPUs. Besides the directives, the changes mainly consist of replacing vector notation with explicit loops, combining loops with the same bounds where possible, and moving non-GPUized function calls outside of the GPU section. I also added new notation for the number of vertical levels (nzm_clubb and nzt_clubb), which improves readability and will make it easier to merge in future versions of clubb. I also included some timing statements, similar to the ones added in the Earthworks ew-develop branch, which this version of clubb_intr is also compatible with.
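A minimal sketch of the kind of transformation described above, not code taken from clubb_intr.F90. The program name, the arrays a, b, c, and the sizes ncol and nz are hypothetical and chosen only for illustration. It shows two array-syntax statements over the same bounds rewritten as one explicit, fused loop nest carrying an OpenACC directive. Without OpenACC enabled, the directive is treated as a comment and the loop runs on the CPU.

```fortran
program loop_fusion_sketch
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)
  integer, parameter :: ncol = 4, nz = 8   ! hypothetical sizes
  real(r8) :: a(ncol,nz), b(ncol,nz), c(ncol,nz)
  integer :: i, k

  b = 1._r8

  ! Before: two array-syntax statements over the same bounds
  !   a(1:ncol,1:nz) = 2._r8 * b(1:ncol,1:nz)
  !   c(1:ncol,1:nz) = a(1:ncol,1:nz) + b(1:ncol,1:nz)

  ! After: one explicit loop nest, with both statements fused,
  ! that an OpenACC compiler can offload as a single kernel
  !$acc parallel loop gang vector collapse(2) copyin(b) copyout(a, c)
  do k = 1, nz
    do i = 1, ncol
      a(i,k) = 2._r8 * b(i,k)
      c(i,k) = a(i,k) + b(i,k)
    end do
  end do

  ! Sanity check: every element of c should equal 2*1 + 1
  if (any(c /= 3._r8)) error stop "unexpected result"
  print *, "ok"
end program loop_fusion_sketch
```

Fusing the two statements means a and b are each traversed once, and on a GPU it replaces two kernel launches with one.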

This is bit-for-bit (BFB) on CPUs (tested with intel), runs with intel+debugging, and passes the ECT test when comparing CPU results to GPU results (using cam7). There are some options that I didn't GPUize or test (do_clubb_mf, do_rainturb, do_cldcool, clubb_do_icesuper, single_column), so I left the code for them untouched and added checks that stop the run if any of them are set when the code is compiled with OpenACC.
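A rough sketch of what such a compile-time guard could look like, not the checks actually added in this PR. The option names come from the list above, but here they are declared as local stand-in logicals, and `error stop` stands in for whatever abort routine the real code uses. The sketch assumes the file is run through the preprocessor (for example by using the .F90 extension) and relies on the `_OPENACC` macro, which compilers define when OpenACC is enabled.

```fortran
program acc_option_guard_sketch
  implicit none

  ! Stand-ins for the real namelist options; all default to off here.
  logical :: do_clubb_mf       = .false.
  logical :: do_rainturb       = .false.
  logical :: do_cldcool        = .false.
  logical :: clubb_do_icesuper = .false.
  logical :: single_column     = .false.

#ifdef _OPENACC
  ! _OPENACC is defined only when the compiler has OpenACC enabled,
  ! so this check disappears entirely from CPU-only builds.
  if (do_clubb_mf .or. do_rainturb .or. do_cldcool .or. &
      clubb_do_icesuper .or. single_column) then
    error stop "option not supported with OpenACC"
  end if
#endif

  print *, "option check passed"
end program acc_option_guard_sketch
```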

If something turns out to be wrong with these changes, it would be nice to at least get in this version, an earlier commit that contains only a new OpenACC data statement and some timer additions.

@Katetc Katetc self-requested a review October 21, 2024 21:43
Collaborator

@nusbaume nusbaume left a comment

Looks good to me! I had some questions and change requests but none of them are required, and of course if you have any concerns with any of my requests then just let me know. Thanks!

call init_pdf_implicit_coefs_terms_api( pverp+1-top_lev, ncol, sclr_dim, &
pdf_implicit_coefs_terms_chnk(lchnk) )
end if
! Initialize physics tendency arrays, copy the state to state1 array to use in this routine
Collaborator

I might change this comment to just say:

Suggested change
- ! Initialize physics tendency arrays, copy the state to state1 array to use in this routine
+ ! Initialize physics tendency arrays

As you have another comment below in the location where you are copying the state.

Comment on lines +2769 to +2772
! Determine number of vertical levels used in clubb, thermo variables are nzt_clubb
! and momentum variables are nzm_clubb
nzt_clubb = pver + 1 - top_lev
nzm_clubb = pverp + 1 - top_lev
Collaborator

I don't know if this will impact GPU performance, but from a readability standpoint I wonder if it makes sense to make these two quantities module-level variables, set them once at the start of clubb_ini_cam (similar to where nlev was set), and then use them everywhere you are otherwise computing pver + 1 - top_lev and pverp + 1 - top_lev.

This would also be beneficial because these quantities will never actually change during a CAM run, so they only need to be set once.
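A small sketch of this idea, under stated assumptions: the module name clubb_levels_sketch, the routine init_levels, and the values passed in the demo program are all hypothetical. In the real code the assignment would presumably sit in clubb_ini_cam, with pver, pverp, and top_lev coming from their existing modules rather than being passed as arguments.

```fortran
module clubb_levels_sketch
  implicit none
  private
  public :: init_levels, nzt_clubb, nzm_clubb

  ! Set once at initialization; "protected" makes them read-only
  ! outside this module.
  integer, protected :: nzt_clubb = -1   ! thermodynamic levels
  integer, protected :: nzm_clubb = -1   ! momentum levels

contains

  subroutine init_levels(pver, pverp, top_lev)
    integer, intent(in) :: pver, pverp, top_lev
    nzt_clubb = pver  + 1 - top_lev
    nzm_clubb = pverp + 1 - top_lev
  end subroutine init_levels

end module clubb_levels_sketch

program levels_demo
  use clubb_levels_sketch, only: init_levels, nzt_clubb, nzm_clubb
  implicit none

  ! Hypothetical grid sizes, for illustration only
  call init_levels(pver=32, pverp=33, top_lev=1)
  print *, nzt_clubb, nzm_clubb
end program levels_demo
```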

wp3(1:ncol,pverp) = wp3(1:ncol,pver)
up2(1:ncol,pverp) = up2(1:ncol,pver)
vp2(1:ncol,pverp) = vp2(1:ncol,pver)
! Initialize the apply_const variable (note special logic is due to eularian backstepping)
Collaborator

Typo:

Suggested change
- ! Initialize the apply_const variable (note special logic is due to eularian backstepping)
+ ! Initialize the apply_const variable (note special logic is due to eulerian backstepping)

! to zero.
fcor(:) = 0._r8
! Set the ztodt timestep in pbuf for SILHS
ztodtptr(:) = 1.0_r8*hdtime
Collaborator

Maybe this is a question for @Katetc, but why multiply by one instead of just using hdtime directly?

Comment on lines +2863 to +2864
!$acc state1, state1%q, state1%u, state1%v, state1%t, state1%pmid, state1%s, state1%pint, &
!$acc state1%zm, state1%zi, state1%pdeldry, state1%pdel, state1%omega, state1%phis, &
Collaborator

@nusbaume nusbaume Oct 24, 2024

Just curious as to why you have to bring in state1 in addition to the relevant state1 variables (e.g. state1%q)?


! Compute exner at the surface for converting the sensible heat fluxes
! to a flux of potential temperature for use as clubb's boundary conditions
inv_exner_clubb_surf(i) = 1._r8 / ( ( state1%pmid(i,pver) / p0_clubb )**( rairv(i,pver,lchnk) / cpairv(i,pver,lchnk) ) )
Collaborator

Couldn't you just do this instead?

Suggested change
- inv_exner_clubb_surf(i) = 1._r8 / ( ( state1%pmid(i,pver) / p0_clubb )**( rairv(i,pver,lchnk) / cpairv(i,pver,lchnk) ) )
+ inv_exner_clubb_surf(i) = inv_exner_clubb(i,pver)

Of course if this changes answers on CPUs then feel free to ignore this request.

do i=1, ncol
clubbtop(i) = top_lev
do while ((rtp2(i,clubbtop(i)) <= 1.e-15_r8 .and. rcm(i,clubbtop(i)) == 0._r8) .and. clubbtop(i) < pver)
clubbtop(i) = clubbtop(i) + 1
clubbtop(i) = clubbtop(i) + 1
Collaborator

Why remove the indent here?
