Human readers could serve as judges, but a discrepancy in how the evaluation prompt and the episode are rendered for human readers makes this difficult.
Description
Human readers could serve as judges. However, a discrepancy in how the evaluation prompt and the episode are rendered for human readers makes this difficult.
Additional Information
I am specifically talking about:
There is also a discrepancy in how the evaluation prompt is composed: