Fix regression with supporting operator managed drivers #196

Merged: 1 commit, Oct 29, 2024

@klueska commented Oct 29, 2024

Tested on a DGX-A100 node with an operator-managed driver:

export NVIDIA_CTK_PATH=/usr/local/nvidia/toolkit/nvidia-ctk
export NVIDIA_DRIVER_ROOT=/run/nvidia/driver
helm upgrade -i --create-namespace --namespace nvidia nvidia-dra-driver deployments/helm/k8s-dra-driver \
    ${NVIDIA_CTK_PATH:+--set nvidiaCtkPath=${NVIDIA_CTK_PATH}} \
    ${NVIDIA_DRIVER_ROOT:+--set nvidiaDriverRoot=${NVIDIA_DRIVER_ROOT}} \
    --wait
kubectl apply -f demo/specs/quickstart/gpu-test6.yaml

$ kubectl get pod -n gpu-test6
NAME                  READY   STATUS    RESTARTS   AGE
pod-9cc5685d7-2xd9j   1/1     Running   0          33s
pod-9cc5685d7-grn5p   1/1     Running   0          33s
pod-9cc5685d7-q958c   1/1     Running   0          33s
pod-9cc5685d7-zbs74   1/1     Running   0          33s
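
The `helm upgrade` command above relies on the shell's `${VAR:+word}` expansion, so each `--set` flag is passed only when the matching variable is set and non-empty. A minimal sketch of that behavior follows; `build_helm_flags` is a hypothetical helper written for illustration and is not part of this PR or the chart.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only, not part of the PR): prints the
# optional --set flags that the helm command would receive for the
# current values of NVIDIA_CTK_PATH and NVIDIA_DRIVER_ROOT.
build_helm_flags() {
    # ${VAR:+word} expands to word only if VAR is set and non-empty;
    # otherwise it expands to nothing, so the whole flag is dropped.
    set -- ${NVIDIA_CTK_PATH:+--set "nvidiaCtkPath=${NVIDIA_CTK_PATH}"} \
           ${NVIDIA_DRIVER_ROOT:+--set "nvidiaDriverRoot=${NVIDIA_DRIVER_ROOT}"}
    # Join the resulting arguments with spaces and print them on one line.
    printf '%s\n' "$*"
}

# Only the driver root is set, so only one flag is produced.
unset NVIDIA_CTK_PATH
NVIDIA_DRIVER_ROOT=/run/nvidia/driver
build_helm_flags   # prints: --set nvidiaDriverRoot=/run/nvidia/driver
```

With both variables unset, the helper prints an empty line, which corresponds to the helm command falling back to the chart's default values for `nvidiaCtkPath` and `nvidiaDriverRoot`.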

@klueska merged commit 737b4c5 into NVIDIA:main on Oct 29, 2024
6 checks passed