add memory usage variables for use on derecho #190
@@ -13,6 +13,8 @@
<BATCH_SYSTEM>pbs</BATCH_SYSTEM>
<SUPPORTED_BY>cseg</SUPPORTED_BY>
<MAX_TASKS_PER_NODE>128</MAX_TASKS_PER_NODE>
<MEM_PER_TASK>10</MEM_PER_TASK>
<MAX_MEM_PER_NODE>235</MAX_MEM_PER_NODE>
I think MAX_MEM_PER_NODE can increase to 470 for a GPU node on Derecho. Maybe add the gpu_type="!none" attribute for the value of a GPU node?
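For illustration, the suggestion above might look like the fragment below. This is only a sketch: it assumes MAX_MEM_PER_NODE accepts a gpu_type attribute, which nothing in this PR confirms, and it reuses the 235 and 470 values mentioned in this thread.

```xml
<!-- Hypothetical: assumes the schema allows a gpu_type attribute on MAX_MEM_PER_NODE -->
<!-- Default value, as added in this PR -->
<MAX_MEM_PER_NODE>235</MAX_MEM_PER_NODE>
<!-- Proposed override for GPU nodes (any gpu_type other than "none") -->
<MAX_MEM_PER_NODE gpu_type="!none">470</MAX_MEM_PER_NODE>
```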
We may have to fine-tune this for GPU nodes. Currently, memory usage on GPU nodes is hardcoded to 470, and this PR won't change that.
You are right that the memory usage on a GPU node is hardcoded to 480 now. I just wonder if it can be replaced by the MAX_MEM_PER_NODE variable here as well, but with a different value depending on whether it is a CPU or GPU node.
I think that it can - is it an issue we need to worry about?
I do not think it is an issue to worry about and we can address it later if it becomes a problem.
@@ -13,6 +13,8 @@
<BATCH_SYSTEM>pbs</BATCH_SYSTEM>
<SUPPORTED_BY>cseg</SUPPORTED_BY>
<MAX_TASKS_PER_NODE>128</MAX_TASKS_PER_NODE>
<MEM_PER_TASK>10</MEM_PER_TASK>
<MAX_MEM_PER_NODE>235</MAX_MEM_PER_NODE>
<MAX_GPUS_PER_NODE>4</MAX_GPUS_PER_NODE>
<MAX_MPITASKS_PER_NODE>128</MAX_MPITASKS_PER_NODE>
<MAX_CPUTASKS_PER_GPU_NODE>64</MAX_CPUTASKS_PER_GPU_NODE>
I think GPU_TYPE, GPU_OFFLOAD and MPI_GPU_WRAPPER_SCRIPT have been removed from my last PR. Do you need to merge the latest main branch first?
It hasn't been removed - it wasn't in this PR and will be included in the merge.
Adds two new variables for memory usage control on derecho: MEM_PER_TASK and MAX_MEM_PER_NODE.