It's interesting that you notice a difference between ChatGPT and HostedGPT in this regard, but it's plausible that the algorithm for managing history is different. I actually did something really naive and intended to go back and optimize it at some point but I never did. It's right here: https://github.com/AllYourBot/hostedgpt/blob/main/app/services/ai_backend/open_ai.rb#L69
First, `max_tokens` should really be something like:

```ruby
max_length_of_response_for_good_user_experience = 3000 # hard-coded value we can tweak
[input_tokens + max_length_of_response_for_good_user_experience, context_limit_of_model].min
```
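To make that concrete, here's a minimal sketch of how that calculation could look; `input_token_count` and `context_window_for` are hypothetical helpers I'm naming only for illustration, not anything that exists in the repo today:

```ruby
MAX_RESPONSE_TOKENS_FOR_GOOD_UX = 3000 # hard-coded value we can tweak

# Sketch only: `input_token_count` and `context_window_for` are hypothetical
# helpers (e.g. a Tiktoken-backed counter and a per-model lookup table).
def max_tokens_for(messages, model)
  input_tokens  = input_token_count(messages)
  context_limit = context_window_for(model)

  # Cap the total budget so input plus response never exceeds the model's context window.
  [input_tokens + MAX_RESPONSE_TOKENS_FOR_GOOD_UX, context_limit].min
end
```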
I even added the Tiktoken gem to the project to prepare for accurate token counting, but I haven't addressed this yet. I also never got around to truncating history. It looks like `preceding_messages` always returns all preceding messages. If I'm reading the code correctly, they should eventually exceed the model's context length and start erroring out. This needs to be fixed at some point. The method to get preceding messages should be something like:
```ruby
preceding_messages_up_to_max_tokens_of(max_input_tokens_allowed)
```
(I'm naively naming these things just for pseudo-code purposes)
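For illustration, a naive version of that method could walk backwards from the newest message and stop once the token budget is spent. This is just a sketch: `token_count_for` is a hypothetical Tiktoken-backed helper, and the message attribute names are placeholders, not existing code.

```ruby
# Naive sketch: keep the most recent messages that fit within the budget.
def preceding_messages_up_to_max_tokens_of(max_input_tokens_allowed)
  budget = max_input_tokens_allowed
  kept = []

  preceding_messages.reverse_each do |message|
    cost = token_count_for(message)
    break if cost > budget

    budget -= cost
    kept.unshift(message) # restore chronological order
  end

  kept
end

# Hypothetical helper around the Tiktoken gem; the exact API, encoding name,
# and the way the message text is accessed are assumptions here.
def token_count_for(message)
  @encoding ||= Tiktoken.get_encoding("cl100k_base")
  @encoding.encode(message.content_text.to_s).length
end
```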