Equivalent issue in models: OpenADMET/openadmet-models#37
We need the ability to discretise continuous data into binary or multiclass labels, using one of two main approaches:
Based on thresholds:
Fixed thresholds (preferably chosen according to some kind of scientifically reasonable criterion)
Polaris folks have a great example of how this is done here:
https://github.com/polaris-hub/auroris/blob/main/auroris/curation/actions/_discretize.py
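To make the threshold-based option concrete, here is a minimal sketch built on `numpy.digitize`. The function name `discretise_by_thresholds`, its parameters, and the example values are hypothetical and only for illustration; this is not the Auroris API and not a proposed final interface.

```python
import numpy as np


def discretise_by_thresholds(values, thresholds, right=False):
    """Map continuous values to integer class labels using fixed thresholds.

    Hypothetical helper for discussion only.
    One threshold gives binary labels (0/1); k thresholds give k + 1 classes.
    Missing values (NaN) stay NaN instead of being assigned a class.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    if thresholds.ndim != 1 or thresholds.size == 0:
        raise ValueError("thresholds must be a non-empty 1D sequence")
    if np.any(np.diff(thresholds) <= 0):
        raise ValueError("thresholds must be strictly increasing")

    values = np.asarray(values, dtype=float)
    # With right=False, a value equal to a threshold falls into the upper class.
    labels = np.digitize(values, thresholds, right=right).astype(float)
    # np.digitize would place NaN in the top bin, so mask it explicitly.
    labels[np.isnan(values)] = np.nan
    return labels


# Made-up values and cut-offs, purely illustrative.
example = [4.2, 5.0, np.nan, 7.1]
binary = discretise_by_thresholds(example, [5.0])
multiclass = discretise_by_thresholds(example, [5.0, 7.0])
```

Returning a float array is a deliberate choice in this sketch so that missing measurements can be carried through as NaN; an implementation that drops or rejects missing values could return integers instead.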
Based on the distribution of values (e.g. equal bins, quartiles or clustering). We need to discuss this further, as it is likely only applicable in certain situations where the dynamic range is large.
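As a starting point for that discussion, the distribution-based option could look roughly like the sketch below. It assumes "equal bins" means equal-width bins and covers only the equal-width and quantile cases; clustering is left out. The function name `discretise_by_distribution` and the `strategy` values are hypothetical.

```python
import numpy as np


def discretise_by_distribution(values, n_bins=4, strategy="quantile"):
    """Derive bin edges from the data itself, then assign class labels.

    Hypothetical helper for discussion only.
    strategy="uniform": equal-width bins between the observed min and max.
    strategy="quantile": bins holding roughly equal numbers of samples
    (n_bins=4 corresponds to quartiles).
    """
    values = np.asarray(values, dtype=float)
    observed = values[~np.isnan(values)]
    if observed.size == 0:
        raise ValueError("no non-missing values to derive bins from")

    if strategy == "uniform":
        edges = np.linspace(observed.min(), observed.max(), n_bins + 1)
    elif strategy == "quantile":
        edges = np.quantile(observed, np.linspace(0.0, 1.0, n_bins + 1))
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    # Only interior edges are needed; duplicates (heavily tied data) are
    # collapsed, so fewer than n_bins classes may come back.
    interior = np.unique(edges[1:-1])
    labels = np.digitize(values, interior).astype(float)
    labels[np.isnan(values)] = np.nan  # keep missing values missing
    return labels


# Made-up values, purely illustrative.
quartile_labels = discretise_by_distribution(np.arange(1, 9), n_bins=4)
uniform_labels = discretise_by_distribution([0.0, 1.0, 2.0, 10.0], n_bins=2, strategy="uniform")
```

Because the edges depend on the data passed in, they would need to be fitted on the training set and reused unchanged elsewhere. If a scikit-learn dependency is acceptable, `sklearn.preprocessing.KBinsDiscretizer` already offers `uniform`, `quantile` and `kmeans` strategies and might be an alternative to hand-rolled code.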
At this point it's unclear whether we should implement this here or in models; for now, just collecting issues here.