Can not successfully train using the script. #1

@JianxXiong

In the open-rs training setup, when I use a single machine with two GPUs, the model is automatically loaded into CPU memory. The parameters then sit on the CPU while the data is on the GPU, which raises an error. However, when I use a single machine with one GPU, a different error arises at line 2321 of accelerate/accelerator.py, self.deepspeed_engine_wrapped.backward(loss, **kwargs), with the message AttributeError: 'NoneType' object has no attribute 'backward'. I wonder whether the code is complete, or whether the package versions matter. If possible, could you release the environment info of your training machine? Have a good day.
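
To make version comparison easier, here is a minimal sketch of how the environment info could be collected on either machine. The package list is my own guess at what is relevant (accelerate and deepspeed appear in the traceback; torch and transformers are assumptions), and the helper name collect_versions is hypothetical, not part of open-rs.

```python
from importlib import metadata
import platform

# Assumed list of relevant packages; adjust to match the actual training setup.
PACKAGES = ["torch", "accelerate", "deepspeed", "transformers"]


def collect_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its installed version."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Running this on both the working and the failing machine and comparing the two outputs should show whether a version mismatch is a plausible cause.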
