I would like to express my gratitude for your excellent work on CodonBERT. I have been thoroughly impressed by your research and the accompanying code.
However, I have encountered a discrepancy that I would like to clarify. In your paper and code, the vocabulary size is stated as 5³ + 5 = 130, based on the five characters 'A', 'U', 'G', 'C', and 'N' (i.e., 125 possible codons plus 5 special tokens). Yet, in the CodonBERT PyTorch model you provided, the vocabulary size is set to 69.
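For reference, here is the arithmetic I am comparing, sketched in Python. The count of 5 special tokens and the specific token names are my assumptions (standard BERT-style specials), not taken from your code; the guess that 69 corresponds to codons over 'A'/'U'/'G'/'C' only is likewise just a hypothesis on my part.

```python
from itertools import product

# Codons are 3-letter words over a nucleotide alphabet.
# With 'N' included, as the paper describes: 5^3 = 125 codons.
codons_with_n = ["".join(c) for c in product("AUGCN", repeat=3)]

# Assumed number of special tokens (e.g. [PAD], [UNK], [CLS], [SEP], [MASK]).
n_special = 5

vocab_paper = len(codons_with_n) + n_special
print(vocab_paper)  # 125 + 5 = 130, the size stated in the paper

# My guess at the 69: codons over 'A'/'U'/'G'/'C' only (4^3 = 64) plus 5 specials.
codons_no_n = ["".join(c) for c in product("AUGC", repeat=3)]
vocab_model = len(codons_no_n) + n_special
print(vocab_model)  # 64 + 5 = 69, the size set in the PyTorch model
```

If the checkpoint indeed drops 'N'-containing codons (or collapses them to a single [UNK]-like token), that would reconcile the two numbers, but I would appreciate confirmation.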
Could you please explain the rationale behind this difference in vocabulary size? Understanding this would greatly help me in comprehending and utilizing your model more effectively.
Thank you in advance for your assistance.