Can't pickle <class 'dlimp.dataset.DLataset'> #6

@DelinQu

Hi kvablack,
Thanks for your great work! I'm using dlimp to read the OpenX dataset and train a model with the Hugging Face Trainer for multi-GPU acceleration. In multi-GPU training, a serialization error occurs when the dataset is loaded and training starts. I encounter the following error:

import pickle

train_dataset = build_datasets(
    data_args,
)

# TEST: dump the dataset using pickle
with open("outputs/dataset.obj", "wb") as filehandler:
    pickle.dump(train_dataset, filehandler)

# [error]
pickle.PicklingError: Can't pickle <class 'dlimp.dataset.DLataset'>: it's not the same object as dlimp.dataset.DLataset
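
For context, pickle stores a class by its module and name, and refuses to write it when the object found under that name is not the very same class object. The sketch below reproduces that general mechanism with a made-up module and class (`fake_pkg` and `Dataset` are hypothetical stand-ins, nothing here is taken from dlimp). Whether dlimp rebinds `DLataset` in a comparable way is an assumption I have not verified.

```python
import pickle
import sys
import types

# Hypothetical stand-in module and class; nothing here is taken from dlimp.
fake_pkg = types.ModuleType("fake_pkg")
sys.modules["fake_pkg"] = fake_pkg


def make_class():
    """Create a fresh class that claims to live at fake_pkg.Dataset."""
    cls = type("Dataset", (), {})
    cls.__module__ = "fake_pkg"
    return cls


fake_pkg.Dataset = make_class()
obj = fake_pkg.Dataset()

# While the module attribute is still the instance's class, pickling works.
roundtrip_ok = isinstance(pickle.loads(pickle.dumps(obj)), fake_pkg.Dataset)

# Rebind the name to a different class object, as a class decorator or a
# module reload could do. `obj` still refers to the old class.
fake_pkg.Dataset = make_class()

message = None
try:
    pickle.dumps(obj)
except pickle.PicklingError as exc:
    message = str(exc)

print(message)
```

The second `pickle.dumps` call fails because `obj.__class__` and `fake_pkg.Dataset` are now two different objects, which is the same kind of identity mismatch that the error message above complains about.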

I believe the issue lies in the serialization of the dlimp dataset or the underlying tf dataset. Could you suggest a solution or a workaround?
Many thanks! 🤗
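
One possible workaround, untested against dlimp or the Trainer, would be to avoid pickling the built dataset at all and to pickle only the recipe for building it, so that each process constructs its own dataset. The sketch below uses `functools.partial` for the recipe. The built-in `range` is only a stand-in for a builder such as `build_datasets`, and this assumes the real builder is an importable module-level function whose arguments are themselves picklable.

```python
import functools
import pickle

# Stand-in for the real builder call, e.g. build_datasets(data_args).
# `range` is used only because it is importable in every process.
make_dataset = functools.partial(range, 0, 6, 2)

# Pickle the recipe (function reference plus arguments), not the dataset.
payload = pickle.dumps(make_dataset)

# In the receiving process: restore the recipe, then build the dataset there.
rebuilt_recipe = pickle.loads(payload)
dataset = rebuilt_recipe()

items = list(dataset)
print(items)
```

The design choice is that only a function reference and its arguments cross the process boundary, so the dataset object itself never needs to be serializable.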
