[1pt] PR: Cut down Alaska HUCs runtime #1327

Merged: 9 commits merged on Nov 1, 2024

Contributor

@ZahraGhahremani ZahraGhahremani commented Oct 22, 2024

The purpose of this PR is to cut down the runtime for four Alaska HUCs (19020104, 19020503, 19020402, and 19020602). It significantly reduces runtime by replacing a nested for loop, used to update the rating curve for small segments, with a vectorized process. These changes were applied only to the Alaska HUCs.
As part of this PR, a small modification was also made to bridge_inundation.py.
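
To illustrate the kind of change described above, here is a minimal sketch of replacing a nested loop with a vectorized lookup. It is not the code from src/add_crosswalk.py: the column names (HydroID, Stage, Discharge), the `replacements` mapping from a small segment to a donor segment, and both function names are hypothetical, and the use of a pandas merge is an assumption about one reasonable way to vectorize such an update.

```python
import pandas as pd


def update_small_segments_loop(src: pd.DataFrame, replacements: dict) -> pd.DataFrame:
    """Nested-loop version: one lookup per (small segment, stage) pair."""
    out = src.copy()
    for short_id, donor_id in replacements.items():
        stages = src.loc[src["HydroID"] == short_id, "Stage"]
        for stage in stages:
            donor_q = src.loc[
                (src["HydroID"] == donor_id) & (src["Stage"] == stage), "Discharge"
            ]
            if not donor_q.empty:
                target = (out["HydroID"] == short_id) & (out["Stage"] == stage)
                out.loc[target, "Discharge"] = donor_q.iloc[0]
    return out


def update_small_segments_vectorized(src: pd.DataFrame, replacements: dict) -> pd.DataFrame:
    """Vectorized version: a single merge replaces both loops.

    Assumes each (HydroID, Stage) pair appears once and the index is unique.
    """
    out = src.copy()
    small = out[out["HydroID"].isin(list(replacements))]
    # For every row of a small segment, ask for the donor's row at the same stage.
    wanted = pd.DataFrame(
        {
            "HydroID": small["HydroID"].map(replacements).to_numpy(),
            "Stage": small["Stage"].to_numpy(),
        }
    )
    # A left merge keeps the row order of `wanted`, so results align with `small`.
    donor_q = wanted.merge(src, on=["HydroID", "Stage"], how="left")["Discharge"]
    found = donor_q.notna().to_numpy()
    out.loc[small.index[found], "Discharge"] = donor_q.to_numpy()[found]
    return out


# Hypothetical rating curves: segment 2 is "small" and borrows from segment 1.
curves = pd.DataFrame(
    {
        "HydroID": [1, 1, 1, 2, 2, 2, 3, 3, 3],
        "Stage": [0, 1, 2] * 3,
        "Discharge": [0.0, 10.0, 20.0, 0.0, 1.0, 2.0, 0.0, 30.0, 60.0],
    }
)
updated = update_small_segments_vectorized(curves, {2: 1})
```

Both functions read donor values from the unmodified input, so they should give the same result. The loop version rescans the whole table for every segment and stage, which is the pattern that tends to get slow on HUCs with many small segments.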

Changes

  • src/add_crosswalk.py
  • src/delineate_hydros_and_produce_HAND.sh
  • tools/bridge_inundation.py

Testing

Tested for HUCs 19020104, 19020503, 19020402, and 05030104.

Deployment Plan (For developer use)

How do the changes affect the product?

  • Code only?
  • If applicable, has a deployment plan been created with the deployment person/team?
  • Require new or adjusted data inputs? Does it have start, end and duration code (in UTC)?
  • If new or updated data sets, has the FIM code been updated and tested with the new/adjusted data (subset is fine, but must be a subset of the new data)?
  • Require new pre-clip set?
  • Has new or updated python packages?

Issuer Checklist (For developer use)

You may update this checklist before and/or after creating the PR. If you're unsure about any of them, please ask, we're here to help! These items are what we are going to look for before merging your code.

  • Informative and human-readable title, using the format: [_pt] PR: <description>
  • Links are provided if this PR resolves an issue, or depends on another PR
  • If submitting a PR to the dev branch (the default branch), you have a descriptive Feature Branch name using the format: dev-<description-of-change> (e.g. dev-revise-levee-masking)
  • Changes are limited to a single goal (no scope creep)
  • The feature branch you're submitting as a PR is up to date (merged) with the latest dev branch
  • pre-commit hooks were run locally
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • CHANGELOG updated with template version number, e.g. 4.x.x.x
  • Add yourself as an assignee in the PR as well as the FIM Technical Lead

Merge Checklist (For Technical Lead use only)

  • Update CHANGELOG with latest version number and merge date
  • Update the Citation.cff file to reflect the latest version number in the CHANGELOG
  • If applicable, update README with major alterations

@ZahraGhahremani ZahraGhahremani self-assigned this Oct 23, 2024
@ZahraGhahremani ZahraGhahremani added the enhancement New feature or request label Oct 23, 2024
@ZahraGhahremani ZahraGhahremani linked an issue Oct 23, 2024 that may be closed by this pull request
@ZahraGhahremani ZahraGhahremani changed the title from "Modify Alaska HUCs runtime" to "[1pt] PR: Cut down Alaska HUCs runtime" Oct 23, 2024
@RobHanna-NOAA
Contributor

Interesting. I took the slowest 18 HUCs from the hand_4_5_11_1 run, which was done in AWS Step Functions, made a new temporary HUC list from those same 18, and ran it in AWS as well. Overall time dropped significantly, with the longest HUC dropping from 10.7 hrs to 3.5 hrs. Many HUCs dropped in processing time, but some increased. I will now run another test using a full UAT list and compare it directly to the previous UAT run, which was hand_4_5_10_0.

image

@RobHanna-NOAA
Contributor

In a UAT set test, compared against 4.5.10.0 (the latest UAT set), the results continued to show some strange data. I'm not sure it is a problem, but it seems odd to me. I have attached the results of both this UAT run and the previous smaller-scale sets.

Notes:

  • These were run against AWS Step Functions, so the times are not directly comparable to those from direct UAT runs.
  • It is good to remember that times are not always consistent from run to run. They depend on memory, CPU usage, disk usage, etc., but trends should be very comparable (again, as long as you are not comparing an AWS Step Functions run to a Thor Prod run).
  • The logging system for timing is not perfect, and not all HUCs are successfully recorded in the duration log. That list needs some massaging with each comparison.
  • After we talk, we might do a full BED run, compare it to the last BED run, which was 4.5.11.1, and see what that tells us.

dev-runtime-alaska-compare-stats.xlsx

@RobHanna-NOAA
Contributor

After talking to Zahra, we decided to do a special UAT run against the code for 4.5.11.1. Interesting results:

  • Overall, 4.5.11.1 was noticeably different from 4.5.10.0, with each HUC taking an average of 1.44 mins longer to process (21.84 - 20.40)

  • Runtime AK was a little slower still than 4.5.11.1, with each HUC slowing down by another 0.47 mins on average (22.32 - 21.84)

  • Unclear why 2/3 of the hucs (on average) were slower and only 1/3 faster.

Zahra and I decided that applying the fix to just the Alaska HUCs is probably best; those saw major gains, with some running 2 to 3 times faster. The slowest HUC took 10.71 hrs to process and is now down to 3.5 hrs, which is perfectly acceptable in my opinion given our AWS Step Functions configuration, which is not speed-optimized per HUC.
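
For reference, the gain on the slowest HUC can be worked out from the two figures quoted above. This is only a small arithmetic sketch; the variable names are illustrative and the inputs are the hours reported in this comment.

```python
# Hours for the slowest Alaska HUC, as quoted in the comment above.
before_hrs = 10.71
after_hrs = 3.5

# How many times faster the new run is.
speedup = before_hrs / after_hrs
# Share of the original runtime that was removed, in percent.
reduction_pct = 100 * (1 - after_hrs / before_hrs)

print(f"speedup: {speedup:.2f}x, runtime reduced by {reduction_pct:.1f}%")
```

That works out to roughly a threefold speedup for that HUC, in line with the "2 to 3 times faster" range mentioned for the Alaska HUCs.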

Contributor

@RobHanna-NOAA RobHanna-NOAA left a comment

Code tested well. Please update the PR text and CHANGELOG for the files changed and I can approve it.

Contributor

@RobHanna-NOAA RobHanna-NOAA left a comment

and fix linting please. :)

RobHanna-NOAA previously approved these changes Oct 30, 2024
@ZahraGhahremani
Contributor Author

and fix linting please. :)

All done! Thank you Rob!

RobHanna-NOAA previously approved these changes Oct 31, 2024
@CarsonPruitt-NOAA CarsonPruitt-NOAA merged commit 3acec5e into dev Nov 1, 2024
1 check passed
@CarsonPruitt-NOAA CarsonPruitt-NOAA deleted the dev-runtime-ak branch November 1, 2024 19:30
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[13pt] Abnormal long runtimes for some Alaska HUCs
3 participants