Open
Labels: enhancement (New feature or request)
Description
When training on multiple labeled datasets, we have a use case for nesting a merge inside a concat. Ideally every labeled dataset would contain all the variables we need, but often one dataset is missing some. For example, if c96-shield does not have CO2, it would be nice to merge in a separate dataset that provides it. The resulting config would look something like this:

    dataset:
      concat:
        - data_path: /climate-default/2024-06-20-era5-1deg-8layer-1940-2022-netcdfs
          labels:
            - era5
          subset:
            start_time: '1979-01-01T00:00:00'
            stop_time: '1986-03-31T18:00:00'
        - merge:
            - data_path: /climate-default/2024-07-24-vertically-resolved-c96-1deg-shield-amip-ensemble-dataset/netCDFs/ic_0001
              labels:
                - c96-shield
              subset:
                start_time: '1979-01-01'
                stop_time: '2020-12-31'
            - data_path: /climate-default/shield-co2-data
              labels:
                - c96-shield

We support concat inside merge but not the other way around, because concat is meant to concatenate across time. In the context of multi-label datasets, however, each concat entry is actually a different data source, so it would be convenient to allow merge within each entry.
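
To make the request concrete, here is a minimal sketch of how a recursive config parser could accept merge nested inside concat. It is not the project's actual implementation: the class names (`Leaf`, `Merge`, `Concat`) and the helpers `parse` and `labels_of` are hypothetical, and the config is passed as a plain dict (with `subset` omitted) rather than loaded from YAML.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical node types, for illustration only.
@dataclass
class Leaf:
    data_path: str
    labels: tuple


@dataclass
class Merge:
    children: list  # sources combined across variables


@dataclass
class Concat:
    children: list  # entries treated as separate data sources


Node = Union[Leaf, Merge, Concat]


def parse(cfg: dict) -> Node:
    """Recursively build a tree, allowing either nesting order."""
    if "concat" in cfg:
        return Concat([parse(c) for c in cfg["concat"]])
    if "merge" in cfg:
        return Merge([parse(c) for c in cfg["merge"]])
    return Leaf(cfg["data_path"], tuple(cfg.get("labels", ())))


def labels_of(node: Node) -> set:
    """Collect every label reachable from a node."""
    if isinstance(node, Leaf):
        return set(node.labels)
    return set().union(*(labels_of(c) for c in node.children))


# Dict form of the config proposed in this issue.
config = {
    "concat": [
        {
            "data_path": "/climate-default/2024-06-20-era5-1deg-8layer-1940-2022-netcdfs",
            "labels": ["era5"],
        },
        {
            "merge": [
                {
                    "data_path": "/climate-default/2024-07-24-vertically-resolved-c96-1deg-shield-amip-ensemble-dataset/netCDFs/ic_0001",
                    "labels": ["c96-shield"],
                },
                {
                    "data_path": "/climate-default/shield-co2-data",
                    "labels": ["c96-shield"],
                },
            ]
        },
    ]
}

tree = parse(config)
```

Because `parse` recurses on both keys, the same function handles concat inside merge and merge inside concat; any restriction on nesting order would have to be an explicit check rather than a consequence of the structure.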