Open
Description
I can't load last.ckpt of my fine-tuned model:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[6], line 5
      3 ckpt="logs/myexp/checkpoints/last.ckpt"
      4 config = OmegaConf.load(f"{config}")
----> 5 model = load_model_from_config(config, f"{ckpt}")
      6 sampler = DDIMSampler(model)
Cell In[4], line 38
     36 if "global_step" in pl_sd:
     37     print(f"Global Step: {pl_sd['global_step']}")
---> 38 sd = pl_sd["state_dict"]
     39 model = instantiate_from_config(config.model)
     40 m, u = model.load_state_dict(sd, strict=False)
KeyError: 'state_dict'

This is probably because the checkpoint was not saved correctly. After fine-tuning finishes, the script crashes:
Epoch 0: 10%| | 61001/616605 [5:46:21<52:34:42, 2.94it/s, loss=0.166, v_num=0, train/l
Saving latest checkpoint...
Traceback (most recent call last):
  File "main.py", line 779, in <module>
    trainer.test(model, data)
  File "/home/federico/Desktop/InST/.venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 911, in test
    return self._call_and_handle_interrupt(self._test_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "/home/federico/Desktop/InST/.venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/federico/Desktop/InST/.venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 954, in _test_impl
    results = self._run(model, ckpt_path=self.tested_ckpt_path)
  File "/home/federico/Desktop/InST/.venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1128, in _run
    verify_loop_configurations(self)
  File "/home/federico/Desktop/InST/.venv/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 42, in verify_loop_configurations
    __verify_eval_loop_configuration(trainer, model, "test")
  File "/home/federico/Desktop/InST/.venv/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 186, in __verify_eval_loop_configuration
    raise MisconfigurationException(f"No `{loader_name}()` method defined to run `Trainer.{trainer_method}`.")
pytorch_lightning.utilities.exceptions.MisconfigurationException: No `test_dataloader()` method defined to run `Trainer.test`.

Edit:
Even if I comment out these lines so that no exception is raised, the checkpoint is still not saved correctly:
if not opt.no_test and not trainer.interrupted:
    trainer.test(model, data)
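
To narrow down what last.ckpt actually contains, it may help to look at its top-level keys before indexing "state_dict". The following is only a minimal diagnostic sketch: `extract_state_dict` is a hypothetical helper name (not part of InST or PyTorch Lightning), and it assumes the loaded checkpoint is a plain dict-like object, as `pl_sd` is in the traceback above.

```python
from typing import Any, Mapping


def extract_state_dict(ckpt: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the weights mapping stored under "state_dict".

    Hypothetical helper: instead of a bare KeyError, it reports which
    top-level keys the checkpoint does contain, so an empty or partial
    checkpoint is easy to spot.
    """
    if "state_dict" in ckpt:
        return ckpt["state_dict"]
    # Keys are converted to str so that sorting never fails on mixed types.
    found = sorted(map(str, ckpt))
    raise KeyError(f"no 'state_dict' in checkpoint; top-level keys: {found}")
```

In the loading cell this would replace the direct lookup, for example `pl_sd = torch.load(ckpt, map_location="cpu")` followed by `sd = extract_state_dict(pl_sd)`. If the reported key list is empty or lacks the weights, the problem lies in how the checkpoint was written rather than in the loading code.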