This repository was archived by the owner on Oct 31, 2023. It is now read-only.
Facing error when loading the checkpoints after training cpc/train.py #6
Open
Description
I trained the model with cpc/train.py. When I evaluated it with cpc/eval/linear_separability.py, using the checkpoints saved during training, I got the following error: "FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My Drive/CPC_audio/cpc/checkpoints/checkpoint_args.json'". When I checked the checkpoint directory, I found that no checkpoint_args.json file had been saved. The training loop also does not appear to write checkpoint_args.json:
```python
def run(trainDataset,
        valDataset,
        batchSize,
        samplingMode,
        cpcModel,
        cpcCriterion,
        nEpoch,
        pathCheckpoint,
        optimizer,
        scheduler,
        logs):

    print(f"Running {nEpoch} epochs")
    startEpoch = len(logs["epoch"])
    bestAcc = 0
    bestStateDict = None
    start_time = time.time()

    for epoch in range(startEpoch, nEpoch):

        print(f"Starting epoch {epoch}")
        utils.cpu_stats()

        trainLoader = trainDataset.getDataLoader(batchSize, samplingMode,
                                                 True, numWorkers=0)

        valLoader = valDataset.getDataLoader(batchSize, 'sequential', False,
                                             numWorkers=0)

        print("Training dataset %d batches, Validation dataset %d batches, batch size %d" %
              (len(trainLoader), len(valLoader), batchSize))

        locLogsTrain = trainStep(trainLoader, cpcModel, cpcCriterion,
                                 optimizer, scheduler, logs["logging_step"])

        locLogsVal = valStep(valLoader, cpcModel, cpcCriterion)

        print(f'Ran {epoch + 1} epochs '
              f'in {time.time() - start_time:.2f} seconds')

        torch.cuda.empty_cache()

        currentAccuracy = float(locLogsVal["locAcc_val"].mean())
        if currentAccuracy > bestAcc:
            bestStateDict = fl.get_module(cpcModel).state_dict()

        for key, value in dict(locLogsTrain, **locLogsVal).items():
            if key not in logs:
                logs[key] = [None for x in range(epoch)]
            if isinstance(value, np.ndarray):
                value = value.tolist()
            logs[key].append(value)

        logs["epoch"].append(epoch)

        if pathCheckpoint is not None \
                and (epoch % logs["saveStep"] == 0 or epoch == nEpoch-1):

            modelStateDict = fl.get_module(cpcModel).state_dict()
            criterionStateDict = fl.get_module(cpcCriterion).state_dict()

            fl.save_checkpoint(modelStateDict, criterionStateDict,
                               optimizer.state_dict(), bestStateDict,
                               f"{pathCheckpoint}_{epoch}.pt")
            utils.save_logs(logs, pathCheckpoint + "_logs.json")
```
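
As a possible workaround, here is a minimal sketch of writing the missing file by hand. It assumes that the evaluation script expects a plain JSON dump of the training arguments at `<pathCheckpoint>_args.json`, which I am inferring from the path in the error message and from the `_logs.json` and `_{epoch}.pt` suffixes used in `run`. The helpers `save_args` and `load_args` are hypothetical names of mine and are not part of the repository.

```python
import argparse
import json
from pathlib import Path


def save_args(args, path_checkpoint):
    """Dump parsed arguments to '<path_checkpoint>_args.json' (assumed naming)."""
    out_path = Path(f"{path_checkpoint}_args.json")
    # Create the checkpoint directory if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w") as f:
        json.dump(vars(args), f, indent=2)
    return out_path


def load_args(path_checkpoint):
    """Read the arguments back as an argparse.Namespace."""
    with open(f"{path_checkpoint}_args.json") as f:
        return argparse.Namespace(**json.load(f))
```

Calling something like `save_args(args, args.pathCheckpoint)` once before `run(...)` should produce `checkpoint_args.json` next to the `.pt` files when `pathCheckpoint` ends in `checkpoints/checkpoint`. This only works if every value in `args` is JSON-serializable.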