Current CESM test failures #76

Open
mnlevy1981 opened this issue May 10, 2023 · 0 comments

(given the impending move from POP -> MOM6, I don't expect to fix these; opening an issue ticket in case I get asked about testing in the future)

Description of the issue:

Some, but not all, tests fail on cheyenne when built with gfortran and DEBUG=TRUE. With cesm2_3_beta12, the only test that fails is:

SMS_Ld2_P80_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ecosys_81blocks_100x116_spacecurve
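
For readers less familiar with the test names, here is a minimal Python sketch that splits one into its dot-separated fields. It assumes the usual CIME layout of `TESTTYPE[_MODIFIERS].GRID.COMPSET[.MACHINE_COMPILER][.TESTMODS]`, and reading the `D` modifier as the debug build is my interpretation of that convention. `parse_test_name` is a hypothetical helper, not part of CIME.

```python
def parse_test_name(name):
    """Split a CIME-style test name into its fields (assumed layout, see above)."""
    fields = name.split(".")
    if len(fields) < 3:
        raise ValueError(f"expected at least TEST.GRID.COMPSET, got {name!r}")

    # First field: test type followed by optional underscore-separated modifiers.
    test_type, *modifiers = fields[0].split("_")

    # Fourth field, if present: machine and compiler joined by the last underscore.
    machine = compiler = None
    if len(fields) > 3:
        machine, compiler = fields[3].rsplit("_", 1)

    return {
        "test_type": test_type,
        "modifiers": modifiers,
        "debug": "D" in modifiers,  # assumption: "D" marks a DEBUG=TRUE build
        "grid": fields[1],
        "compset": fields[2],
        "machine": machine,
        "compiler": compiler,
        "testmods": fields[4] if len(fields) > 4 else None,
    }


info = parse_test_name(
    "SMS_Ld2_P80_D.T62_g37.C1850ECO.cheyenne_gnu."
    "pop-ecosys_81blocks_100x116_spacecurve"
)
print(info["test_type"], info["modifiers"], info["machine"], info["compiler"])
```

Under those assumptions, every failing test in this issue parses as a debug build on `cheyenne` with the `gnu` compiler, which matches the configuration described above.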

I updated MARBL from marbl0.40.3 to marbl0.41.0 (which required small POP changes as well) and two tests failed:

ERS_Ld5_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ecosys_box_atm_co2
SMS_Ld2_P80_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ecosys_81blocks_100x116_spacecurve

Moving to marbl0.42.0 (also with minor changes to POP) produced a slightly different pair of failed tests:

SMS_Ld2_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ciso_daily_r4_tavg
SMS_Ld2_P80_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ecosys_81blocks_100x116_spacecurve

Moving to the version of MARBL in marbl-ecosys/MARBL#423 produced the same pair of failures:

SMS_Ld2_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ciso_daily_r4_tavg
SMS_Ld2_P80_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ecosys_81blocks_100x116_spacecurve
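
To make the pattern across versions easier to see, the four lists above can be compared as sets. This is only a bookkeeping sketch: the dictionary keys and the helper names `always_failing` and `sometimes_failing` are my own labels, and the test names are copied from the lists in this issue.

```python
ECO_81BLOCKS = (
    "SMS_Ld2_P80_D.T62_g37.C1850ECO.cheyenne_gnu."
    "pop-ecosys_81blocks_100x116_spacecurve"
)
BOX_ATM_CO2 = "ERS_Ld5_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ecosys_box_atm_co2"
CISO_DAILY = "SMS_Ld2_D.T62_g37.C1850ECO.cheyenne_gnu.pop-ciso_daily_r4_tavg"

# Failing tests per configuration, as reported in this issue.
FAILURES = {
    "cesm2_3_beta12": {ECO_81BLOCKS},
    "marbl0.41.0": {BOX_ATM_CO2, ECO_81BLOCKS},
    "marbl0.42.0": {CISO_DAILY, ECO_81BLOCKS},
    "marbl-ecosys/MARBL#423": {CISO_DAILY, ECO_81BLOCKS},
}


def always_failing(failures):
    """Tests that fail in every configuration."""
    return set.intersection(*failures.values())


def sometimes_failing(failures):
    """Tests that fail in at least one configuration but not in all of them."""
    return set.union(*failures.values()) - always_failing(failures)


print("always:", sorted(always_failing(FAILURES)))
print("sometimes:", sorted(sometimes_failing(FAILURES)))
```

Only the `81blocks_100x116_spacecurve` test fails in every configuration; the `box_atm_co2` and `ciso_daily_r4_tavg` failures come and go with the MARBL version.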

The traceback for each failed test is the same, pointing at something in the tidal mixing module:

51:
51:Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
51:
51:Backtrace for this error:
51:#0  0x2ad3855b7bff in ???
51:#1  0x67c993 in __tidal_mixing_MOD_init_tidal_mixing1
51:     at $EXEROOT/ocn/source/tidal_mixing.F90:919
51:#2  0x8e1191 in __initial_MOD_pop_init_phase1
51:     at $EXEROOT/ocn/source/initial.F90:386
51:#3  0x586e09 in initializerealize
51:     at $EXEROOT/ocn/source/ocn_comp_nuopc.F90:389
51:#4  0x2ad38001705b in _ZN5ESMCI6FTable12callVFuncPtrEPKcPNS_2VMEPi
51:     at /glade/p/cesmdata/cseg/PROGS/build/63684/esmf-8.5.0b19/src/Superstructure/Component/src/ESMCI_FTable.C:2167
51:#5  0x2ad380014198 in ESMCI_FTableCallEntryPointVMHop
51:     at /glade/p/cesmdata/cseg/PROGS/build/63684/esmf-8.5.0b19/src/Superstructure/Component/src/ESMCI_FTable.C:824
51:#6  0x2ad3803e7250 in _ZN5ESMCI3VMK5enterEPNS_7VMKPlanEPvS3_
51:     at /glade/p/cesmdata/cseg/PROGS/build/63684/esmf-8.5.0b19/src/Infrastructure/VM/src/ESMCI_VMKernel.C:2320
51:#7  0x2ad38040150c in _ZN5ESMCI2VM5enterEPNS_6VMPlanEPvS3_
51:     at /glade/p/cesmdata/cseg/PROGS/build/63684/esmf-8.5.0b19/src/Infrastructure/VM/src/ESMCI_VM.C:1216
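
When comparing tracebacks from several tests, it helps to extract the first frame that lives in model source instead of reading each log by eye. The Python sketch below does that for the gfortran format shown above. It assumes every line carries a numeric prefix such as `51:` and that model source paths start with `$EXEROOT`. The function names are hypothetical, and the sample is a shortened copy of the traceback in this issue.

```python
import re

# "51:#1  0x67c993 in __tidal_mixing_MOD_init_tidal_mixing1"
FRAME = re.compile(r"^\d+:#(\d+)\s+0x[0-9a-f]+ in (\S+)$")
# "51:     at $EXEROOT/ocn/source/tidal_mixing.F90:919"
LOCATION = re.compile(r"^\d+:\s+at (.+):(\d+)$")

SAMPLE = """\
51:#0  0x2ad3855b7bff in ???
51:#1  0x67c993 in __tidal_mixing_MOD_init_tidal_mixing1
51:     at $EXEROOT/ocn/source/tidal_mixing.F90:919
51:#2  0x8e1191 in __initial_MOD_pop_init_phase1
51:     at $EXEROOT/ocn/source/initial.F90:386
"""


def parse_backtrace(text):
    """Return one dict per '#N' frame, with file/line filled in when an 'at' line follows."""
    frames = []
    for raw in text.splitlines():
        line = raw.rstrip()
        match = FRAME.match(line)
        if match:
            frames.append({"index": int(match.group(1)), "symbol": match.group(2),
                           "file": None, "line": None})
            continue
        match = LOCATION.match(line)
        if match and frames:
            frames[-1]["file"] = match.group(1)
            frames[-1]["line"] = int(match.group(2))
    return frames


def first_model_frame(frames, prefix="$EXEROOT"):
    """First frame whose source path starts with the given prefix, or None."""
    for frame in frames:
        if frame["file"] and frame["file"].startswith(prefix):
            return frame
    return None


def split_gfortran_symbol(symbol):
    """Split gfortran's '__module_MOD_procedure' mangling into (module, procedure)."""
    if symbol.startswith("__") and "_MOD_" in symbol:
        module, procedure = symbol[2:].split("_MOD_", 1)
        return module, procedure
    return None, symbol


top = first_model_frame(parse_backtrace(SAMPLE))
print(split_gfortran_symbol(top["symbol"]), top["file"], top["line"])
```

For the traceback above, this picks out `init_tidal_mixing1` in module `tidal_mixing` at `tidal_mixing.F90:919` as the innermost model frame.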

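Running the same test on izumi, however, tells a different story: the run aborts with a divide by zero in the atmosphere-ocean flux calculation rather than in tidal mixing: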

Runtime Error: *** Arithmetic exception: Floating divide by zero Runtime Error: - aborting
$SRCROOT/components/cmeps/cime_config/../cesm/flux_atmocn/shr_flux_mod.F90, line 331: Error occurred in SHR_FLUX_MOD:FLUX_ATMOCN
$SRCROOT/components/cmeps/cime_config/../mediator/med_phases_aofluxes_mod.F90, line 1047: Called by MED_PHASES_AOFLUXES_MOD:MED_AOFLUXES_UPDATE
$SRCROOT/components/cmeps/cime_config/../mediator/med_phases_aofluxes_mod.F90, line 315: Called by MED_PHASES_AOFLUXES_MOD:MED_PHASES_AOFLUXES_RUN
$SRCROOT/components/cmeps/cime_config/../cesm/driver/esmApp.F90, line 141: Called by ESMAPP
[i039.cgd.ucar.edu:mpi_rank_39][error_sighandler] Caught error: Aborted (signal 6)
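
The nag traceback uses a different layout (`path, line N: Error occurred in ...` followed by `Called by ...` lines), so a separate parser is needed to compare it with the gfortran one. The Python sketch below assumes exactly the line format shown above and collapses the `..` segments in the paths. `parse_nag_trace` is a hypothetical helper, and the sample is a shortened copy of the output in this issue.

```python
import posixpath
import re

# "<path>, line <N>: Error occurred in MODULE:ROUTINE" or "... Called by NAME"
NAG_LINE = re.compile(r"^(.+), line (\d+): (Error occurred in|Called by) (\S+)$")

SAMPLE = """\
$SRCROOT/components/cmeps/cime_config/../cesm/flux_atmocn/shr_flux_mod.F90, line 331: Error occurred in SHR_FLUX_MOD:FLUX_ATMOCN
$SRCROOT/components/cmeps/cime_config/../mediator/med_phases_aofluxes_mod.F90, line 1047: Called by MED_PHASES_AOFLUXES_MOD:MED_AOFLUXES_UPDATE
$SRCROOT/components/cmeps/cime_config/../cesm/driver/esmApp.F90, line 141: Called by ESMAPP
"""


def parse_nag_trace(text):
    """Return call-chain entries, innermost first, from nag-style traceback lines."""
    entries = []
    for raw in text.splitlines():
        match = NAG_LINE.match(raw.strip())
        if not match:
            continue  # skip lines that are not part of the call chain
        path, line_no, _kind, name = match.groups()
        # "MODULE:ROUTINE" for module procedures; a bare name (e.g. a program) otherwise.
        module, _, routine = name.rpartition(":")
        entries.append({
            "file": posixpath.normpath(path),  # collapses "cime_config/.."
            "line": int(line_no),
            "module": module or None,
            "routine": routine,
        })
    return entries


for entry in parse_nag_trace(SAMPLE):
    print(entry["file"], entry["line"], entry["module"], entry["routine"])
```

The first entry is where the exception was raised (`FLUX_ATMOCN` in `shr_flux_mod.F90`, line 331), which is in the mediator's flux code and not in POP's tidal mixing module.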

Version:

  • CESM: cesm2_3_beta12
  • POP2: cesm_pop_2_1_20230209

Machine/Environment Description:

cheyenne (gfortran) and izumi (nag)

Any xml/namelist changes or SourceMods:

no
