Why does only LLMEmbedder prefix encode with an underscore, while the earlier models do not? #1114

Open
dream-tentacle opened this issue Sep 20, 2024 · 2 comments

Comments

@dream-tentacle

def _encode(self, sentences: Union[List[str], str], batch_size: int = 256, max_length: int = 512) -> np.ndarray:

Should the underscore be removed for consistency?

@ZiyiXia
Collaborator

ZiyiXia commented Sep 21, 2024

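In Python naming conventions, a function whose name starts with a single underscore is generally meant to be called by other internal functions rather than used directly as part of the API.
See PEP 8 – Style Guide for Python Code: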

_single_leading_underscore: weak “internal use” indicator. E.g. from M import * does not import objects whose names start with an underscore.
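
To make the quoted rule concrete, here is a minimal sketch of how a star import treats underscore-prefixed names. The module name `demo_mod` and both functions are hypothetical and built in memory purely for illustration; they are not part of FlagEmbedding.

```python
import sys
import types

# Build a throwaway module in memory (the name "demo_mod" is hypothetical).
demo_mod = types.ModuleType("demo_mod")
exec(
    "def encode(text):\n"
    "    return _encode(text)\n"
    "\n"
    "def _encode(text):\n"
    "    return text.lower()\n",
    demo_mod.__dict__,
)
sys.modules["demo_mod"] = demo_mod

# Star-import into a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_mod import *", namespace)

imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['encode']

# The underscore is only a convention: explicit access still works.
print(demo_mod._encode("ABC"))  # abc
```

Because the module defines no `__all__`, only the public name is star-imported. The underscore-prefixed function stays reachable through the module object, which is why PEP 8 calls it a weak indicator.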

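LLMEmbedder was fine-tuned on six specific tasks, each with its own query instruction and key instruction, so when encoding you need to choose the instruction that matches the task for both queries and keys. I recommend calling encode_queries() and encode_keys() directly (both of them call _encode() internally). See the LLMEmbedder README for usage.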
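
The pattern described above can be sketched roughly as follows. This is not the FlagEmbedding implementation: the class `ToyEmbedder`, the `INSTRUCTIONS` table, the task names, and the placeholder "embedding" are all assumptions made for illustration. Only the shape is taken from the comment: public `encode_queries()` and `encode_keys()` pick a task-specific instruction and delegate to a private `_encode()`.

```python
from typing import Dict, List, Union

# Hypothetical instruction table; the real LLMEmbedder instructions differ.
INSTRUCTIONS: Dict[str, Dict[str, str]] = {
    "qa": {"query": "Represent this question: ", "key": "Represent this passage: "},
    "chat": {"query": "Represent this dialogue turn: ", "key": "Represent this reply: "},
}


class ToyEmbedder:
    """Illustrative stand-in for an instruction-tuned embedder."""

    @staticmethod
    def _with_instruction(texts: Union[List[str], str], instruction: str) -> List[str]:
        # Accept a single string or a list, and prepend the instruction to each text.
        if isinstance(texts, str):
            texts = [texts]
        return [instruction + t for t in texts]

    def _encode(self, sentences: List[str]) -> List[List[float]]:
        # Internal helper: a placeholder "embedding" (length and a checksum),
        # standing in for the real model forward pass.
        return [[float(len(s)), float(sum(map(ord, s)) % 97)] for s in sentences]

    def encode_queries(self, queries: Union[List[str], str], task: str = "qa") -> List[List[float]]:
        # Public entry point: choose the query instruction for the task.
        return self._encode(self._with_instruction(queries, INSTRUCTIONS[task]["query"]))

    def encode_keys(self, keys: Union[List[str], str], task: str = "qa") -> List[List[float]]:
        # Public entry point: choose the key instruction for the task.
        return self._encode(self._with_instruction(keys, INSTRUCTIONS[task]["key"]))


embedder = ToyEmbedder()
query_vecs = embedder.encode_queries("what is an embedding?", task="qa")
key_vecs = embedder.encode_keys(["An embedding is a vector.", "Unrelated text."], task="qa")
print(len(query_vecs), len(key_vecs))
```

In this sketch, calling `_encode()` directly would skip the instruction prefix entirely, which is one plausible reason to mark it as internal when the instruction is task-dependent.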

@dream-tentacle
Author

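@ZiyiXia Thanks, but I was comparing against the other, earlier models (FlagLLMModel, FlagModel). Their interfaces all look much the same, so if that is the reasoning, shouldn't those two also get the underscore? Thanks!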
