
Different Vocab Size Between Tokenizer and Model's Word Embedding Layer #33

@louisowen6

Description

Expected Behavior

The tokenizer's vocabulary size (len(tokenizer)) and the vocabulary dimension of BERT's word embedding layer should be the same.

Actual Behavior

The tokenizer's vocabulary size and the vocabulary dimension of BERT's word embedding layer are not the same.

Steps to Reproduce the Problem

  1. Load the model: model = AutoModel.from_pretrained('indobenchmark/indobert-base-p1')
  2. Print the model: print(model)

  [screenshot: print(model) output, showing the word embedding layer's vocabulary dimension]

  3. Load the tokenizer: tokenizer = AutoTokenizer.from_pretrained('indobenchmark/indobert-base-p1')
  4. Print the length of the tokenizer: print(len(tokenizer))

  [screenshot: print(len(tokenizer)) output]
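Below is a minimal sketch that reproduces the comparison in one script, assuming the transformers library with PyTorch is installed. The helper name check_vocab_alignment is ours, not part of the original report; the actual numbers are in the screenshots above and are not restated here.

```python
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "indobenchmark/indobert-base-p1"

def check_vocab_alignment(model_id: str) -> None:
    # Load both artifacts from the same checkpoint, as in the repro steps.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)

    # Number of tokens the tokenizer knows (base vocab plus any added tokens).
    tokenizer_size = len(tokenizer)
    # Number of rows in the model's word embedding matrix.
    embedding_rows = model.get_input_embeddings().num_embeddings

    print(f"len(tokenizer)          = {tokenizer_size}")
    print(f"word embedding rows     = {embedding_rows}")
    print(f"model.config.vocab_size = {model.config.vocab_size}")
    print("match" if tokenizer_size == embedding_rows else "MISMATCH")

if __name__ == "__main__":
    check_vocab_alignment(MODEL_ID)
```

If the embedding matrix really is larger than the tokenizer, model.resize_token_embeddings(len(tokenizer)) is the usual transformers call for bringing the two in line; whether that is the right fix here depends on how the released checkpoint was trained, which this issue is asking about.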
