Can the KV cache become invalid? #1136
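In a multithreaded setup, if the GPU server is short on resources, can the KV cache be preempted? For example, suppose two conversations are running at the same time and both are very long. If session a cuts in partway through session b, then session b's cached KV would presumably be lost, i.e. session a's cached KV would be used instead. Because this involves GPU computation and I do not have enough resources, I have not been able to verify it.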
Answered by
zRzRzRzRzRzRzR
Apr 21, 2024
Replies: 2 comments 1 reply
Answer selected by
zRzRzRzRzRzRzR
Yes.
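To make the scenario in the question concrete, here is a toy sketch in plain Python. It is not this project's code: `SingleSlotKVCache` and `run_interleaved` are hypothetical names, and the assumption that memory holds only one session's entries at a time is made purely for illustration. It shows how an interleaved session evicts the other session's cached entries, so the first session has to rebuild its cache when it resumes.

```python
from typing import List, Optional, Tuple


class SingleSlotKVCache:
    """Hypothetical cache with room for only one session's KV entries."""

    def __init__(self) -> None:
        self.owner: Optional[str] = None  # session currently holding the slot
        self.entries: List[str] = []      # stand-in for the cached K/V tensors

    def append(self, session_id: str, token: str) -> bool:
        """Store one token's entry; return True if the existing cache was reused."""
        reused = self.owner == session_id
        if not reused:
            # A different session held the slot, so its entries are evicted
            # and this session starts from an empty cache.
            self.owner = session_id
            self.entries = []
        self.entries.append(f"{session_id}:{token}")
        return reused


def run_interleaved(turns: List[Tuple[str, str]]) -> List[bool]:
    """Replay (session_id, token) turns and report cache reuse per turn."""
    cache = SingleSlotKVCache()
    return [cache.append(session_id, token) for session_id, token in turns]


# Session a runs twice, session b cuts in, then session a resumes.
hits = run_interleaved([("a", "t1"), ("a", "t2"), ("b", "t1"), ("a", "t3")])
print(hits)
```

In this sketch the last turn of session a reports no reuse, because session b took over the slot in between. Keying the cache by session ID, with enough memory for every active session, would avoid the eviction in this toy model.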