Hello there.
I tried to use m2_bert_80M_2k to generate embeddings for text strings with lengths around 500 tokens following your example on Huggingface](https://huggingface.co/togethercomputer/m2-bert-80M-2k-retrieval). However, the outputs = model(**input_ids) line tooks over 15s on average, slower than expected. Could you please help me find the issue here?
I also tested your example for 12 tokens. The model forwarding process is still slow (over 5s for 12tokens & padding="longest", over 16s for 12tokens & padding="max_length"(=2048).
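For reference, here is roughly how I measure the forward pass. The `time_forward` helper is my own sketch (not from your example); in my tests `fn` wraps the `model(**input_ids)` call from the model card, with a warm-up call first so one-time costs (e.g. CUDA initialization) are not counted:

```python
import time

def time_forward(fn, warmup=1, runs=3):
    """Average wall-clock seconds per call of fn, after `warmup` untimed calls."""
    for _ in range(warmup):  # warm-up run: excludes one-time setup costs
        fn()
    start = time.perf_counter()
    out = None
    for _ in range(runs):  # average over several runs to smooth out noise
        out = fn()
    return out, (time.perf_counter() - start) / runs
```

Used as `outputs, seconds = time_forward(lambda: model(**input_ids))`; the timings above are the averaged `seconds` values.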


Thanks in advance!