ANI-2x training data set #39

@JMorado

Hi,

How can one find out exactly which data set was used to train ANI-2x?

The original ANI-2x paper states that the training data set is composed of molecules from a variety of sources, including the GDB-11 database, the ChEMBL database, the S66x8 benchmark, and some randomly generated amino acids and dipeptides.
However, as I understand it, these data sets are not included in full, because specific sampling techniques are then applied to them.

Is it possible to find out exactly which molecules were used for training?

Thank you.
Best,
João
