Encode Sentences
After loading the model as `model`, you can encode sentences with:

`model.encode(sentences, device=None, return_numpy=False, normalize_to_unit=True, keepdim=False, batch_size=64, max_length=128)`
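A minimal sketch of a typical call is shown below; the `simcse` tool and the checkpoint name are assumptions, so substitute whatever model you actually loaded.

```python
# Minimal sketch: load a model with the simcse tool and encode a list of sentences.
# The checkpoint name is an assumption; use the model you actually work with.
from simcse import SimCSE

model = SimCSE("princeton-nlp/sup-simcse-bert-base-uncased")

embeddings = model.encode(
    ["A woman is reading.", "A man is playing a guitar."],
    return_numpy=True,   # numpy arrays instead of the default PyTorch tensors
    batch_size=64,
    max_length=128,
)
print(embeddings.shape)  # (2, dim), e.g. (2, 768) for a BERT-base encoder
```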
Inputs
- `sentences`: a string or a list of strings.
- `device`: `cuda` or `cpu`.
- `return_numpy`: whether to return numpy arrays (`True`) or PyTorch tensors (`False`, the default).
- `normalize_to_unit`: whether to normalize the output embeddings to unit vectors.
- `keepdim`: if the input is a single sentence, whether to keep the batch-size dimension in the output embedding.
- `batch_size`: if the input is a list of sentences, the batch size used for encoding. Larger batch sizes usually give higher throughput, as long as the batch fits on your device.
- `max_length`: truncate sentences that exceed this maximum length.
Outputs
- A numpy array or a PyTorch tensor of size `(n, dim)`, where `n` is the number of sentences.
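As a sketch of how the output shape depends on `keepdim` and on whether the input is a single string or a list (reusing the `model` object from the snippet above):

```python
# Sketch: output shapes for single-sentence vs. list inputs (assumes `model` from above).
single = model.encode("A woman is reading.")                      # tensor of shape (dim,)
single_kept = model.encode("A woman is reading.", keepdim=True)   # tensor of shape (1, dim)
batch = model.encode(["First sentence.", "Second.", "Third."])    # tensor of shape (3, dim)
print(single.shape, single_kept.shape, batch.shape)
```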