I noticed the following token counts in the code:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

layer_dict = {2: 0, 6: 1, 15: 2}  # pruning layer -> index into the sparse token list
sparse_token_list_192 = [300, 200, 110]  # 2x576, 4x300, 10x200, 16x110
sparse_token_list_128 = [303, 110, 36]
sparse_token_list_64 = [66, 30, 17]

sparse_token_dict = {
    192: sparse_token_list_192,
    128: sparse_token_list_128,
    64: sparse_token_list_64,
}
```
But if I follow the schedule implied by the comment,

(2×576 + 4×300 + 10×200 + 16×110) / 32 = 191,

so how does the paper arrive at an average token count of 192? Likewise,

(2×576 + 4×303 + 10×110 + 16×36) / 32 = 126.25,

so how does the paper arrive at an average token count of 128?
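For reference, here is a minimal sketch of the arithmetic above. It assumes a 32-layer decoder split into spans of 2, 4, 10, and 16 layers (the split implied by the comment in the code), each span keeping a fixed number of visual tokens; the helper `average_tokens` is hypothetical, not code from this repo:

```python
# Minimal sketch of the averaging arithmetic (not code from this repo).
# Assumes 32 decoder layers split into spans of 2, 4, 10, and 16 layers,
# each span keeping a fixed number of visual tokens.
def average_tokens(spans):
    """spans: list of (num_layers, tokens_per_layer) pairs."""
    total_layers = sum(n for n, _ in spans)
    return sum(n * t for n, t in spans) / total_layers

print(average_tokens([(2, 576), (4, 300), (10, 200), (16, 110)]))  # 191.0
print(average_tokens([(2, 576), (4, 303), (10, 110), (16, 36)]))   # 126.25
```

Neither result matches the averages of 192 and 128 reported in the paper, which is the source of my confusion.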