At the end of the pipeline, use the outputs from the "all data", "top n" and "main effect lasso" stages to create a table with the following columns:
- predictor -- parse this out of the all_data_formula that is generated in the interface
- all_data -- This stores the appropriate STAGE_RESULT level that results from the all data modeling stage
- topn -- This stores the appropriate STAGE_RESULT level that results from the topn modeling stage
- main_effect -- This stores the appropriate STAGE_RESULT level for the mTF main effect according to the final main effect/interactor stage
- mTF -- This stores the appropriate STAGE_RESULT level for the mTF itself according to the final main effect/interactor stage
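A minimal sketch of how such a table could be assembled, under several assumptions that are not part of this spec: all_data_formula is taken to be a simple string of the form "response ~ term + term + ...", each stage's results are taken to be available as a dict mapping predictor to level name, and the function names `parse_predictors` and `build_summary_table` are hypothetical. Levels are stored as plain strings here.

```python
from typing import Dict, List

# Level name recorded when a predictor is absent from a stage's results.
MISSING = "none"


def parse_predictors(formula: str) -> List[str]:
    """Return the right-hand-side terms of a formula like 'y ~ a + b'.

    Assumption: the formula contains a '~' and its terms are joined by '+'.
    The real all_data_formula may need a proper formula parser.
    """
    _, rhs = formula.split("~", 1)
    return [term.strip() for term in rhs.split("+") if term.strip()]


def build_summary_table(
    formula: str,
    all_data: Dict[str, str],
    topn: Dict[str, str],
    main_effect: Dict[str, str],
    mtf: Dict[str, str],
) -> List[Dict[str, str]]:
    """Build one row per predictor, with one column per stage.

    Each stage argument is a hypothetical dict of predictor -> level name.
    Predictors missing from a stage get the "none" level.
    """
    table = []
    for predictor in parse_predictors(formula):
        table.append(
            {
                "predictor": predictor,
                "all_data": all_data.get(predictor, MISSING),
                "topn": topn.get(predictor, MISSING),
                "main_effect": main_effect.get(predictor, MISSING),
                "mTF": mtf.get(predictor, MISSING),
            }
        )
    return table
```

Returning a list of dicts keeps the sketch dependency-free; the rows can be passed to a data frame constructor if the pipeline already uses one.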
STAGE_RESULT
You can implement this as an enum. Whether or not it is formally implemented, it has the following levels:
- positive: the coefficient interval lies entirely above 0 (or, in the final stage, the coefficient itself is greater than 0)
- negative: same as above, but entirely below 0 (or, in the final stage, less than 0)
- zero: the coefficient interval contains 0 (or, in the final stage, the coefficient is exactly 0)
- none: the predictor does not exist at the given stage. For example, the all_data column contains only "positive", "negative" or "zero", since all predictors are present at that point. In the "topn" stage, however, use the "none" level for predictors that are not used because they were "zero" in the all data stage.
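One way the levels above could be expressed, as a sketch rather than a required design: the class name `StageResult` and the helpers `classify_interval` and `classify_coefficient` are hypothetical, and an absent predictor is assumed to be represented by `None`.

```python
from enum import Enum
from typing import Optional, Tuple


class StageResult(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    ZERO = "zero"
    NONE = "none"


def classify_interval(interval: Optional[Tuple[float, float]]) -> StageResult:
    """Classify a (lower, upper) coefficient interval.

    Assumption: None means the predictor was not part of this stage.
    """
    if interval is None:
        return StageResult.NONE
    lower, upper = interval
    if lower > 0:
        return StageResult.POSITIVE  # whole interval above zero
    if upper < 0:
        return StageResult.NEGATIVE  # whole interval below zero
    return StageResult.ZERO  # interval contains (or touches) zero


def classify_coefficient(coef: Optional[float]) -> StageResult:
    """Classify a single coefficient, as used in the final stage."""
    if coef is None:
        return StageResult.NONE
    if coef > 0:
        return StageResult.POSITIVE
    if coef < 0:
        return StageResult.NEGATIVE
    return StageResult.ZERO
```

An interval with an endpoint exactly at 0 is treated as "zero" here; the spec does not say how that boundary case should be handled, so adjust if needed.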