vLLM currently has three schemes for system monitoring and performance analysis: OpenTelemetry (OTel), Prometheus, and the torch profiler.
How about providing a service profiler that reflects only the detailed events inside the vLLM framework (not including torch-level events)?
For example, the service profiler would record each request's arrival time and finish time (not the whole lifetime of the vLLM service; the user controls when the profiler starts and stops), plus the important events during the request's life cycle, such as queue switching, prefill forward, decode forward, token sampling, block swap in/out, together with the relevant request id, and so on.
Just like the torch profiler, the service profiler would expose the performance data in the trace event format; viewing it in Perfetto or chrome://tracing would make the inference system no longer a black box to the user and much more conducive to analyzing performance issues in the vLLM framework.
The service profiler should be a lightweight offline tool for performance analysis. An example:
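A minimal sketch of what such a profiler could look like, assuming a hypothetical `ServiceProfiler` class and `record` method (these names are illustrative, not an existing vLLM API). It collects request-scoped events and dumps them as Chrome/Perfetto trace-event-format JSON:

```python
import json
import threading
import time


class ServiceProfiler:
    """Collects request-level events and exports them as a Chrome/Perfetto trace."""

    def __init__(self):
        self._events = []
        self._lock = threading.Lock()
        self._enabled = False

    def start(self):
        # The user controls the profiling window, not the whole service lifetime.
        self._enabled = True

    def stop(self):
        self._enabled = False

    def record(self, name: str, request_id: str, start_us: float, dur_us: float):
        if not self._enabled:
            return
        # "X" is a complete event in the trace event format: start time + duration.
        event = {
            "name": name,              # e.g. "prefill_forward", "decode_forward"
            "cat": "vllm.service",
            "ph": "X",
            "ts": start_us,
            "dur": dur_us,
            "pid": 0,
            "tid": hash(request_id) % 1000,  # one lane per request in the viewer
            "args": {"request_id": request_id},
        }
        with self._lock:
            self._events.append(event)

    def export(self, path: str):
        with open(path, "w") as f:
            json.dump({"traceEvents": self._events}, f)


# Usage sketch: wrap the interesting stages, then open the JSON file in
# chrome://tracing or https://ui.perfetto.dev.
profiler = ServiceProfiler()
profiler.start()
t0 = time.time() * 1e6
profiler.record("prefill_forward", request_id="req-42", start_us=t0, dur_us=1500)
profiler.record("decode_forward", request_id="req-42", start_us=t0 + 2000, dur_us=300)
profiler.stop()
profiler.export("vllm_service_trace.json")
```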