I would like to use dataframe_regression on a DataFrame whose columns are a MultiIndex. Doing so, I ran into an issue with numerical tolerances: default_tolerance and tolerances had no effect.
The following test illustrates the issue:
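    import numpy as np
    import pandas as pd
    from pytest_regressions.dataframe_regression import DataFrameRegressionFixture


    def test_multiindex_cases(dataframe_regression: DataFrameRegressionFixture, no_regen):
        data1 = 1.1 * np.ones(5)
        data2 = 2.2 * np.ones(5)
        df = pd.DataFrame.from_dict({"data1": data1, "data2": data2})
        # replace the flat column names with a two-level MultiIndex
        df.columns = pd.MultiIndex.from_tuples((("data1", "a"), ("data2", "b")))
        dataframe_regression.check(df)
        # introduce numerical error
        df.loc[3, ("data1", "a")] += 0.01
        # the 0.01 deviation is well within atol=1e-1, so this check should pass
        dataframe_regression.check(df, default_tolerance=dict(atol=1e-1, rtol=1e-17))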
The associated reference data is:
,data1,data2
,a,b
0,1.1000000000000001,2.2000000000000002
1,1.1000000000000001,2.2000000000000002
2,1.1000000000000001,2.2000000000000002
3,1.1000000000000001,2.2000000000000002
4,1.1000000000000001,2.2000000000000002
This results in the following failure:
E AssertionError: Values are not sufficiently close.
E To update values, use --force-regen option.
E
E data1:
E obtained_data1 expected_data1 diff
E 4 1.1100000000000001 1.1000000000000001 ?
E
E WARNING: diffs for this kind of data type cannot be computed.
When reading the reference file, dataframe_regression does not recognise that the first two lines are both header rows. As a result, the columns are treated as object dtype and the numerical check is not performed (similar to #92).
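The effect can be reproduced with pandas alone. This is only a minimal sketch: it assumes the reference file is parsed with pandas.read_csv using its default single-row header handling, which is what the observed behaviour suggests but which I have not confirmed in the plugin code. The CSV text is a shortened version of the reference data above.

```python
import io

import pandas as pd

# Shortened copy of the reference file: two header rows, then data rows.
csv_text = (
    ",data1,data2\n"
    ",a,b\n"
    "0,1.1,2.2\n"
    "1,1.1,2.2\n"
)

# Default parsing (assumed to match what happens to the reference file):
# only the first line is a header, so ",a,b" becomes a data row and the
# value columns can no longer be parsed as numbers.
naive = pd.read_csv(io.StringIO(csv_text))
print(naive.dtypes)

# Declaring both header rows restores the MultiIndex and numeric columns.
multi = pd.read_csv(io.StringIO(csv_text), header=[0, 1], index_col=0)
print(multi.dtypes)
```

With default parsing the second header line is counted as a data row, which would also explain why the failing row is reported at index 4 although the test modifies row 3.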
I have worked around the issue by flattening my DataFrame columns with df.columns.to_flat_index (see the pandas documentation).
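For reference, a minimal sketch of that workaround. The helper name flatten_columns and the "_" separator are my own choices for illustration, not part of pytest-regressions; joining the levels into plain strings is one option among others, and assigning df.columns.to_flat_index() directly also yields a single-level index.

```python
import numpy as np
import pandas as pd


def flatten_columns(df: pd.DataFrame, sep: str = "_") -> pd.DataFrame:
    """Return a copy of df with MultiIndex columns joined into flat strings.

    Hypothetical helper: the name and the separator are illustrative only.
    """
    out = df.copy()
    # to_flat_index() yields one tuple per column, e.g. ("data1", "a")
    out.columns = [sep.join(map(str, col)) for col in df.columns.to_flat_index()]
    return out


df = pd.DataFrame({"data1": 1.1 * np.ones(5), "data2": 2.2 * np.ones(5)})
df.columns = pd.MultiIndex.from_tuples((("data1", "a"), ("data2", "b")))

flat = flatten_columns(df)
print(list(flat.columns))
```

The flattened frame has a single header row, so it can be passed to dataframe_regression.check(flat, ...) in place of df.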
Should we update the dataframe_regression documentation to include this trick, or update the code to apply to_flat_index to all DataFrames?
Might be related to #47