Instability when training DRAIL in the Walker environment #4

@lbwnbzzh

Dear author, I have two questions. First, why does the imitation reward for DRAIL in the Walker environment need an added constant term? Second, training in the Walker environment does not seem very stable when I use the five seeds from the yaml. I would be very grateful for a reply.
