I'm trying to replicate the basic (no-filtering) results. There are major differences between the models I trained locally and the ones on Hugging Face: the locally trained models have worse zero-shot accuracy, and the gap is even larger under linear probing (~20% accuracy from the locally trained models vs. 44% from the Hugging Face model on CIFAR-100).
| Dataset | Encoder | Zero-Shot Test Acc. | Linear-Probe Test Acc. |
|---|---|---|---|
| cifar10 | commonpool_s_s13m_b4k | 0.4077 | 0.685 ± 0.0014 |
| cifar10 | local_commonpool_s_s13m_b4k_0 | 0.3572 | 0.4694 ± 0.0106 |
| cifar10 | local_commonpool_s_s13m_b4k_1 | 0.3443 | 0.4565 ± 0.0143 |
| cifar10 | local_commonpool_s_s13m_b4k_3 | 0.3406 | 0.4609 ± 0.0126 |
| cifar10 | local_commonpool_s_s13m_b4k_4 | 0.3346 | 0.469 ± 0.0141 |
| cifar10 | local_commonpool_s_s13m_b4k_2 | 0.3323 | 0.4447 ± 0.0164 |
| vtab/cifar100 | commonpool_s_s13m_b4k | 0.1297 | 0.4355 ± 0.0025 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_1 | 0.1246 | 0.2024 ± 0.0035 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_0 | 0.1168 | 0.1997 ± 0.0085 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_3 | 0.1139 | 0.2004 ± 0.0066 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_2 | 0.1138 | 0.2002 ± 0.0043 |
| vtab/cifar100 | local_commonpool_s_s13m_b4k_4 | 0.1128 | 0.2047 ± 0.0044 |
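For context, the linear-probe protocol I'm using amounts to training a single softmax layer on frozen image embeddings. A minimal self-contained sketch (synthetic features standing in for the real frozen CLIP embeddings; shapes and names are illustrative, not the actual evaluation code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for frozen image embeddings; in the real protocol these would be
# the CLIP encoder's outputs on CIFAR images.
n_train, n_test, dim, n_classes = 2000, 500, 32, 10
centers = rng.normal(size=(n_classes, dim))
y_tr = rng.integers(0, n_classes, n_train)
y_te = rng.integers(0, n_classes, n_test)
X_tr = centers[y_tr] + rng.normal(scale=1.0, size=(n_train, dim))
X_te = centers[y_te] + rng.normal(scale=1.0, size=(n_test, dim))

# Linear probe: one softmax layer trained on the frozen features.
W = np.zeros((dim, n_classes))
b = np.zeros(n_classes)
lr = 0.5
for _ in range(200):
    logits = X_tr @ W + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(n_train), y_tr] -= 1.0  # gradient of softmax cross-entropy
    W -= lr * (X_tr.T @ p) / n_train
    b -= lr * p.mean(axis=0)

acc = (np.argmax(X_te @ W + b, axis=1) == y_te).mean()
print(f"linear probe accuracy: {acc:.3f}")
```

The probe itself cannot explain the gap, since the same probe is applied to both local and Hugging Face encoders; the difference must come from the features.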
- To my understanding, just calling `train.py --scale small` on the unmodified CommonPool dataset should replicate the no-filter baseline `commonpool_s_s13m_b4k`. Is that right?
- I ran five different seeds for the pretraining and, for each, ten different seeds for the linear probing. Why are the results so different from the online models?
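For clarity on how the ± values in the table were computed — assuming they denote mean ± sample standard deviation over the ten probe seeds (the accuracies below are hypothetical, just to show the aggregation):

```python
import statistics

# Hypothetical per-seed linear-probe accuracies for one encoder (10 probe seeds).
accs = [0.47, 0.46, 0.48, 0.45, 0.47, 0.46, 0.48, 0.47, 0.45, 0.46]
mean = statistics.mean(accs)
std = statistics.stdev(accs)  # sample std (n - 1 denominator)
print(f"{mean:.4f} ± {std:.4f}")
```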