With vgpu-cores set to 50, more than 2 pods can be scheduled onto a single card #30

Open
weapons97 opened this issue Sep 26, 2024 · 5 comments

Comments

weapons97 commented Sep 26, 2024

I defined 8 pods as shown below, each requesting volcano.sh/vgpu-cores: "50". When I exec into the pods and check the GPU bus-id, 6 of the pods turn out to be on the same card.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: xxx
spec:
  replicas: 8
  selector:
    matchLabels:
      app: cnn
  template:
    metadata:
      labels:
        app: cnn
    spec:
      schedulerName: volcano
      containers:
        - # container name and image omitted in the original report
          resources:
            limits:
              # volcano.sh/vgpu-cores: "100"
              # volcano.sh/vgpu-memory: "20000"
              # volcano.sh/vgpu-number: "1"
              # nvidia.com/gpu: 1
              volcano.sh/vgpu-cores: "50"
              volcano.sh/vgpu-memory: "4544"
              volcano.sh/vgpu-number: "1"
...
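
To make the expectation concrete: if vgpu-cores is read as a percentage of one physical GPU (an assumption here, based on the reporter's expectation of at most 2 pods per card), then two pods at 50 each should fill a card. The sketch below groups pods by the bus-id observed inside each pod and flags any card whose requested cores add up to more than 100. The pod names and bus-ids are hypothetical, chosen only to mirror the 6-pods-on-one-card situation described above.

```python
from collections import defaultdict

# Assumption: vgpu-cores is a percentage of one physical GPU, so 100 is a full card.
CORES_PER_CARD = 100


def find_oversubscribed(pod_to_bus_id, cores_per_pod):
    """Return {bus_id: total_cores} for cards whose requested cores exceed CORES_PER_CARD."""
    totals = defaultdict(int)
    for _pod, bus_id in pod_to_bus_id.items():
        totals[bus_id] += cores_per_pod
    return {bus: total for bus, total in totals.items() if total > CORES_PER_CARD}


# Hypothetical placement: 6 pods on one card, 2 pods on another.
placement = {f"cnn-{i}": "00000000:3B:00.0" for i in range(6)}
placement.update({"cnn-6": "00000000:AF:00.0", "cnn-7": "00000000:AF:00.0"})

oversubscribed = find_oversubscribed(placement, cores_per_pod=50)
print(oversubscribed)
```

With this placement, only the first card is reported, with 6 x 50 = 300 cores requested against a budget of 100.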

@archlitchi (Contributor)

Check the output of curl {volcano scheduler cluster ip}:8080/metrics

@weapons97 (Author)

> check curl {volcano scheduler cluster ip}:8080/metrics

[screenshot of the metrics output]
The metrics show a card with 300 cores allocated. Is that reasonable?
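
A check like this can also be done on the raw metrics text instead of by eye. The following is a minimal sketch, not the scheduler's own tooling: it scans Prometheus text-format output for samples whose metric name contains "cores" and reports any value above 100. The metric names in the sample are hypothetical placeholders; the real names exposed by the Volcano scheduler should be taken from the actual /metrics output.

```python
import re

# One Prometheus text-format sample: name, optional {labels}, then the value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)


def over_allocated(metrics_text, name_hint="cores", limit=100.0):
    """Return (name, labels, value) for samples whose name contains name_hint and value > limit."""
    hits = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match or name_hint not in match.group("name"):
            continue
        try:
            value = float(match.group("value"))
        except ValueError:
            continue
        if value > limit:
            hits.append((match.group("name"), match.group("labels") or "", value))
    return hits


# Hypothetical sample; metric names and device IDs are placeholders.
SAMPLE = """\
# TYPE example_vgpu_allocated_cores gauge
example_vgpu_allocated_cores{devID="GPU-aaaa"} 300
example_vgpu_allocated_cores{devID="GPU-bbbb"} 100
example_vgpu_allocated_memory{devID="GPU-aaaa"} 27264
"""

hits = over_allocated(SAMPLE)
print(hits)
```

To run it against a live scheduler, the text saved from the curl command quoted above can be passed to over_allocated in place of SAMPLE.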

@archlitchi (Contributor)

Which version of Volcano are you using?

@weapons97 (Author)

> which version of volcano are you using?

1.9.0

@archlitchi (Contributor)

Fixed. Related PR: volcano-sh/volcano#3774

Waiting for the Volcano community to merge it.
