nv_inference_count no longer includes gpu_uuid? #7479

Open
chriscarollo opened this issue Jul 26, 2024 · 3 comments
Labels
question Further information is requested

Comments

@chriscarollo

I have some Grafana graphs built on Triton's Prometheus metrics, and it appears that after a semi-recent update, nv_inference_count no longer includes a gpu_uuid label (I see only "model" and "version"). As a result, my graph showing the number of inferences per GPU no longer works.
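To make the breakage concrete, here is a minimal Python sketch of the per-GPU aggregation such a graph relies on. The sample metrics text, GPU UUIDs, model names, and the `inference_count_by_gpu` helper are all made up for illustration; only the metric name and the `gpu_uuid`, `model`, and `version` labels come from the report above.

```python
import re
from collections import defaultdict

# Hypothetical Prometheus text output; UUIDs, models, and counts are invented.
SAMPLE = """\
# TYPE nv_inference_count counter
nv_inference_count{gpu_uuid="GPU-aaaa",model="resnet",version="1"} 10
nv_inference_count{gpu_uuid="GPU-bbbb",model="resnet",version="1"} 5
nv_inference_count{model="bert",version="2"} 7
"""

# Match only nv_inference_count samples and capture the label set and value.
SAMPLE_RE = re.compile(r'^nv_inference_count\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def inference_count_by_gpu(text):
    """Sum nv_inference_count per gpu_uuid; unlabeled samples share one bucket."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # skip comments, blank lines, and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        gpu = labels.get("gpu_uuid", "<no gpu_uuid>")
        totals[gpu] += float(match.group("value"))
    return dict(totals)


totals = inference_count_by_gpu(SAMPLE)
print(totals)
```

Samples that carry only "model" and "version" cannot be attributed to any GPU, so they all collapse into the single `<no gpu_uuid>` bucket, which is why a per-GPU panel loses its series.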

@rmccorm4
Collaborator

rmccorm4 commented Jul 31, 2024

Hi @chriscarollo, have you used the tritonserver --model-control-mode EXPLICIT ... (or POLL) feature to dynamically load/unload models before? I believe there may be a known inconsistency: models loaded at startup have no GPU_ID label on non-GPU metrics, while models loaded dynamically after the server has started do have GPU_ID labels applied to those non-GPU metrics.

Please let me know if you can consistently identify or reproduce this behavior one way or the other.
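One possible way to check this is to scrape the metrics once right after startup and again after a dynamic reload, then list which model versions report nv_inference_count without a gpu_uuid label. The sketch below assumes the metrics endpoint is at `http://localhost:8002/metrics` (adjust for your deployment); `fetch_metrics` and `models_missing_gpu_uuid` are hypothetical helper names, not part of Triton.

```python
import re
import urllib.request

LINE_RE = re.compile(r'^nv_inference_count\{([^}]*)\}')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def fetch_metrics(url="http://localhost:8002/metrics"):
    """Fetch the raw metrics text. The URL is an assumption about the deployment."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def models_missing_gpu_uuid(text):
    """Return sorted (model, version) pairs whose samples lack a gpu_uuid label."""
    missing = set()
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an nv_inference_count sample
        labels = dict(LABEL_RE.findall(match.group(1)))
        if "gpu_uuid" not in labels:
            missing.add((labels.get("model", ""), labels.get("version", "")))
    return sorted(missing)
```

Calling `models_missing_gpu_uuid(fetch_metrics())` immediately after startup and again after a model has been reloaded should show whether the label only goes missing for models loaded at startup.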

@rmccorm4 rmccorm4 added the question Further information is requested label Jul 31, 2024
@chriscarollo
Author

I'm actually using model-control-mode POLL, and it does appear that my gpu_id labels came back after it detected new versions. So it looks like this may only be an issue on initial startup?

@rmccorm4
Collaborator

rmccorm4 commented Aug 1, 2024

Hi @chriscarollo, this is a known issue and has a proposed resolution in this PR: triton-inference-server/core#321. Please chime in on the discussion with your use case, impact, etc.

@rmccorm4 rmccorm4 self-assigned this Aug 1, 2024