bug: kruize: gpus not allocatable #782
Comments
@schwesig we should also have kruize check their clusterPolicy for errors
@schwesig you might check if the
@schwesig If my understanding is correct, the node was restarted, and the config we added to the default MIG config map has been deleted (I expect the NVIDIA GPU operator rewrote the config map with its defaults). Node wrk-5 still has the mig.config label set to the custom kruize config, which no longer exists after the rewrite. Ideally the MIG config manager should fall back to the default setting (all-disabled) when the desired config is missing from the config map, but for some reason that has not happened.
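
To make the expected fallback concrete, here is a minimal sketch of the behaviour described above. It is not the MIG manager's actual code. The function name `resolve_mig_config` and the config names `kruize-custom` and `all-1g.5gb` are hypothetical, and the fallback to `all-disabled` is the behaviour this comment expects, not something confirmed from the operator's source.

```python
from typing import Iterable, Tuple

# Default profile named in the comment above; assumed to be the fallback.
DEFAULT_MIG_CONFIG = "all-disabled"


def resolve_mig_config(
    requested: str,
    available: Iterable[str],
    default: str = DEFAULT_MIG_CONFIG,
) -> Tuple[str, bool]:
    """Pick the MIG config to apply for a node.

    requested: value of the node's mig.config label.
    available: config names currently present in the MIG config map.
    Returns (config_name, fell_back), where fell_back is True when the
    requested config was missing and the default was chosen instead.
    """
    names = set(available)
    if requested in names:
        return requested, False
    if default not in names:
        # Neither the requested nor the default config exists; nothing safe to apply.
        raise LookupError(
            f"neither {requested!r} nor default {default!r} is in the config map"
        )
    return default, True


if __name__ == "__main__":
    # Hypothetical state after the config map was rewritten with defaults:
    # the custom kruize entry is gone, but the node label still points at it.
    config_map_entries = ["all-disabled", "all-1g.5gb"]
    chosen, fell_back = resolve_mig_config("kruize-custom", config_map_entries)
    print(chosen, fell_back)
```

With the state we seem to have on wrk-5 (label pointing at a config that is no longer in the map), this sketch would select `all-disabled` and flag that a fallback occurred, which is the outcome we did not observe on the node.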
fyi: tried
Motivation
I see that the pods are not getting launched due to insufficient GPU resources
CrashLoopBackOff
Completion Criteria
Description
Completion dates
Desired - 2024-10-23
Required - 2024-10-25
Involved
@schwesig
@shekhar316
@bharathappali
@tssala23
@dystewart
maybe/FYI