v4.4.10.1 Fix return error status from pre-processing and remove CPU test (#1075)
RobHanna-NOAA authored Feb 16, 2024
1 parent a14e5e1 commit b7f8138
Showing 3 changed files with 27 additions and 12 deletions.
17 changes: 17 additions & 0 deletions docs/CHANGELOG.md
@@ -1,6 +1,23 @@
All notable changes to this project will be documented in this file.
We follow the [Semantic Versioning 2.0.0](http://semver.org/) format.

## v4.4.10.1 - 2024-02-16 - [PR#1075](https://github.com/NOAA-OWP/inundation-mapping/pull/1075)

We recently added code to fim_pre_processing.sh that checks the CPU count. Earlier this test was being done in post-processing and was killing a pipeline that had already been running for a while.
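
For reference, a check of this kind can be sketched as a small function. This is an illustration only, not the code in `fim_pre_processing.sh`: the function name `check_core_budget`, its arguments, and the use of `nproc` to count cores are assumptions made for the sketch.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): succeed when
# huc_jobs * branch_jobs fits within the available core count,
# otherwise print an error and return 22 (bad argument).
check_core_budget() {
    local huc_jobs=$1
    local branch_jobs=$2
    # The third argument is optional; default to the count reported by nproc.
    local available=${3:-$(nproc)}
    local requested=$(( huc_jobs * branch_jobs ))

    if (( requested > available )); then
        echo "ERROR: $requested jobs requested but only $available cores available" >&2
        return 22
    fi
    return 0
}

# Example call with an explicit core count of 16.
check_core_budget 2 4 16 && echo "core budget ok"
```

Returning a status instead of exiting leaves the decision about stopping the run to the caller.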

Fix:
- Removed the CPU test from pre-processing. The run can again fail on this check in post-processing, but we have to leave it that way for now.
- Non-zero exit status codes are now returned from pre-processing and post-processing when an error has occurred.
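
The exit-status pattern can be sketched as follows, under stated assumptions: `usage` only prints help, and the caller exits with 22, which the script's own comment describes as "bad argument". The helper `require_arg` and the help text are hypothetical and do not appear in the repository.

```shell
#!/usr/bin/env bash

# Print help text only; the exit status is chosen by the caller.
usage() {
    echo "Usage: example.sh -u <huc list> -n <run name>"
}

# Hypothetical helper: exit with status 22 (bad argument) when a
# required value is empty, after printing the error and the help text.
require_arg() {
    local value=$1
    local message=$2
    if [ -z "$value" ]; then
        echo "ERROR: $message" >&2
        usage >&2
        exit 22
    fi
}
```

A call such as `require_arg "$runName" "Missing -n run time name argument"` would then end the script with status 22 when `runName` is empty, so a calling process can detect the failure.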

Tested that a non-zero exit status from pre-processing shuts down the AWS Step Functions.
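
As a local analogy for that behavior, the sketch below runs stages in order and stops at the first non-zero status. It does not model AWS Step Functions; `run_stages` and the stand-in stage functions are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical driver: run each stage in order and stop at the first one
# that returns a non-zero status, passing that status back to the caller.
run_stages() {
    local stage status
    for stage in "$@"; do
        status=0
        "$stage" || status=$?
        if [ "$status" -ne 0 ]; then
            echo "Stage '$stage' failed with status $status; stopping." >&2
            return "$status"
        fi
    done
    return 0
}

# Stand-in stages for illustration only.
pre_ok()   { return 0; }
pre_fail() { return 22; }
post()     { echo "post-processing ran"; }
```

With these stand-ins, `run_stages pre_fail post` returns 22 without running `post`, while `run_stages pre_ok post` runs both stages.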

### Changes
- `fim_pre_processing.sh`: added non-zero exit codes on error and removed the CPU test
- `fim_post_processing.sh`: added non-zero exit codes on error

<br/><br/>


## v4.4.10.0 - 2024-02-02 - [PR#1054](https://github.com/NOAA-OWP/inundation-mapping/pull/1054)

Recent testing exposed a bug in the `acquire_and_preprocess_3dep_dems.py` script: it had lost the ability to be re-run so that it finds files that failed in earlier attempts and tries them again. This may have been lost due to confusion over the word "retry". Now "retry" means restarting the entire run, and a new flag called "repair" has been added, meaning fix what failed earlier. This is a key feature because communication failures are common when calling USGS to download DEMs, and with some runs taking many hours it becomes important.
2 changes: 2 additions & 0 deletions fim_post_processing.sh
@@ -46,6 +46,7 @@ if [ "$runName" = "" ]
 then
     echo "ERROR: Missing -n run time name argument"
     usage
+    exit 22
 fi

 outputDestDir=$outputsDir/$runName
@@ -217,6 +218,7 @@ Tcount
 date -u

 find $outputDestDir -type d -exec chmod -R 777 {} +
+find $outputDestDir -type f -exec chmod -R 777 {} +

 echo
 echo "++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++"
20 changes: 8 additions & 12 deletions fim_pre_processing.sh
@@ -46,7 +46,6 @@ usage()
 -skipcal : If this param is included, the S.R.C. will be updated via the calibration points.
 will be skipped.
 "
-    exit
 }

 set -e
@@ -104,16 +103,20 @@ in
     shift
 done

+# exit 22 means bad argument
+
 # print usage if arguments empty
 if [ "$hucList" = "" ]
 then
     echo "ERROR: Missing -u Huclist argument"
     usage
+    exit 22
 fi
 if [ "$runName" = "" ]
 then
     echo "ERROR: Missing -n run time name argument"
     usage
+    exit 22
 fi

 # outputsDir & workDir come from the Dockerfile
@@ -137,6 +140,7 @@ then
     # NONE is not case sensitive
     echo "Error: The -ud <unit deny file> does not exist and is not the word NONE"
     usage
+    exit 22
 fi

 # validate and set defaults for the deny lists
@@ -148,6 +152,7 @@ then
     # NONE is not case sensitive
     echo "Error: The -bd <branch deny file> does not exist and is not the word NONE"
     usage
+    exit 22
 fi

 # We do a 1st cleanup of branch zero using branchZeroDenylist (which might be none).
@@ -164,6 +169,7 @@ then
 then
     echo "Error: The -zd <branch zero deny file> does not exist and is not the word NONE"
     usage
+    exit 22
 else
     # only if the deny branch zero has been overwritten and file exists
     has_deny_branch_zero_override=1
@@ -178,17 +184,7 @@ if [ -d $outputDestDir ] && [ $overwrite -eq 0 ]; then
     echo "ERROR: Output dir $outputDestDir exists. Use overwrite -o to run."
     echo
     usage
-fi
-
-# Test to ensure we are not overuseing cores
-num_available_cores=$(echo $(grep -c processor /proc/cpuinfo))
-let total_requested_jobs=$jobHucLimit*$jobBranchLimit
-if [[ $total_requested_jobs -gt $num_available_cores ]]; then
-    echo
-    echo "ERROR: There are $num_available_cores available, but -jh (jobHucLimit) * -jb (jobBranchLimit)"\
-    "exceed the number of available cores"
-    echo
-    usage
+    exit 22
 fi

 ## SOURCE ENV FILE AND FUNCTIONS ##