Why don't you calculate throughput from the real output tokens? #493

Answered by zhuohan123
wqh17101 asked this question in Q&A
I believe the main reason is that prompt tokens also take compute to process. There are different ways to measure LLM throughput, but I believe the trends between systems should be similar across metrics. I'm moving this issue to Discussions for future questions.
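To make the two metrics concrete, here is a minimal sketch that computes throughput both ways for the same run. The `Request` class, the `throughput` function, and the sample numbers are hypothetical and for illustration only; this is not the project's benchmark code.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical record of one finished request.
    prompt_tokens: int   # tokens in the input prompt
    output_tokens: int   # tokens actually generated


def throughput(requests, elapsed_s, include_prompt=True):
    """Return tokens per second over a benchmark run.

    include_prompt=True counts prompt + output tokens (prompt
    processing also costs compute); include_prompt=False counts
    only generated tokens.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    total = sum(
        r.output_tokens + (r.prompt_tokens if include_prompt else 0)
        for r in requests
    )
    return total / elapsed_s


if __name__ == "__main__":
    # Made-up sample data, not measured results.
    reqs = [Request(100, 50), Request(300, 150)]
    print("prompt + output:", throughput(reqs, 10.0))
    print("output only:    ", throughput(reqs, 10.0, include_prompt=False))
```

The absolute numbers differ between the two metrics, so comparisons between systems only make sense when the same metric is used for each.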

Replies: 2 comments 1 reply

@zhaoyang-star
Answer selected by zhuohan123
This discussion was converted from issue #435 on July 18, 2023 06:49.