I struggled a bit with implementing the location AggregationInstruction as a variable in a netCDF file.
Can anyone remember why we decided on the current format? Here's an example (example 3):
dimensions:
// Extra dimensions
i = 4 ;
j = 2 ;
variables:
// Aggregation definition variables
int location(i, j) ;
data:
location = 6, 6,
1, _,
73, _,
144, _ ;
Here, the size of i is the number of dimensions the variable has, and the size of j is the maximum number of fragments along any one dimension; rows with fewer fragments are padded with missing values (_).
From an implementation point of view, it would make more sense for location to follow the fragment dimensions used by the other AggregationInstructions, with an extra dimension (k) whose size is always 2. location would then hold actual locations (start and end indices) rather than spans, and the file itself would contain the information needed to index the parent array. With the current method, the parser has to build that indexing information itself.
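To make the extra parser work concrete, here is a rough sketch of what has to happen with the current format before the parent array can be indexed. It is only an illustration, not the reference implementation: the function name `spans_to_bounds` is hypothetical, `None` stands in for the `_` missing value, and I'm assuming zero-based, half-open (start, stop) bounds.

```python
from itertools import accumulate, product

# Current format: one row per aggregated dimension, holding the fragment
# sizes (spans) along that dimension. None stands in for the `_` padding.
location = [
    [6, 6],       # time
    [1, None],    # level
    [73, None],   # latitude
    [144, None],  # longitude
]


def spans_to_bounds(row):
    """Turn fragment sizes along one dimension into (start, stop) pairs.

    Hypothetical helper: assumes zero-based, half-open bounds.
    """
    sizes = [n for n in row if n is not None]
    stops = list(accumulate(sizes))   # running total gives each stop index
    starts = [0] + stops[:-1]         # each start is the previous stop
    return list(zip(starts, stops))


per_dimension = [spans_to_bounds(row) for row in location]

# One entry per fragment, in row-major order over the fragment dimensions.
fragments = list(product(*per_dimension))
for bounds in fragments:
    print(bounds)
```

Each entry of `fragments` is the set of bounds for one fragment, which is the information the parser currently has to derive rather than read from the file.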
I'm also a bit concerned that the current method does not really work for sparse arrays. Here is how I'd change the specification:
dimensions:
// Fragment dimensions
f_time = 2 ;
f_level = 1 ;
f_latitude = 1 ;
f_longitude = 1 ;
// Extra dimensions
k = 2 ;
variables:
// Aggregation definition variables
int location(f_time, f_level, f_latitude, f_longitude, k) ;
data:
location = 0, 6,
0, 1,
0, 73,
0, 144,
6, 12,
0, 1,
0, 73,
0, 144 ;
The main downside is that the representation is not as compact, but the CFA Aggregation Files are going to be very small compared to the Aggregated data anyway.
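As a rough sketch of what the proposed layout buys us, the values can be turned into index expressions with almost no processing. This rests on some assumptions of mine that would need pinning down in the specification: the sixteen values are read in the order fragment, then dimension, then (start, stop), and the bounds are zero-based and half-open. The names `flat`, `N_DIMS` and `fragment_slices` are purely illustrative.

```python
# Values copied from the proposed `location` data, in file order.
flat = [0, 6, 0, 1, 0, 73, 0, 144,
        6, 12, 0, 1, 0, 73, 0, 144]

N_DIMS = 4  # time, level, latitude, longitude

# Assumption: values are ordered fragment -> dimension -> (start, stop).
pairs = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]

# Group the pairs per fragment and turn each pair into a slice that can be
# used directly to index the parent (aggregated) array.
fragment_slices = [
    tuple(slice(start, stop) for start, stop in pairs[i:i + N_DIMS])
    for i in range(0, len(pairs), N_DIMS)
]

# The shape each fragment occupies in the parent array.
shapes = [tuple(s.stop - s.start for s in idx) for idx in fragment_slices]

for idx, shape in zip(fragment_slices, shapes):
    print(idx, shape)
```

With an array library, something like `parent[fragment_slices[1]]` would then select the region covered by the second fragment without any cumulative sums.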
Let me know what you think, @davidhassell, @bnlawrence, @JonathanGregory and @sadielbartholomew.