…pidp and old_hidp (broke non-synthetic runs). Also added the faster multiprocessed randomForest model
…and fixed the handovers make targets to work with the new output path parameter
ld-archer
left a comment
Have added the functionality to run the default simulation with MICE data locally, for testing and development. Handovers from this show that predicted income needs some work on scaling. Could perhaps try a log-normal LMM as in branch 231.
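For context on the scaling point, here is a minimal sketch of why the back-transform matters when income is modelled on the log scale. It is not an LMM and not the code from branch 231; the function names and the income values are made up for illustration. It only shows that exponentiating the log-scale mean gives the median, and that the mean needs the extra `sigma^2 / 2` term.

```python
import math
import statistics


def fit_lognormal(incomes):
    """Estimate the mean and s.d. of log(income) from strictly positive values."""
    logs = [math.log(x) for x in incomes]
    return statistics.fmean(logs), statistics.stdev(logs)


def lognormal_median(mu):
    """Naive back-transform: exp(mu) is the median of a log-normal variable."""
    return math.exp(mu)


def lognormal_mean(mu, sigma):
    """Mean of a log-normal variable: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma ** 2 / 2)


# Made-up, right-skewed monthly incomes (illustrative only, not MINOS data).
incomes = [800.0, 1200.0, 1500.0, 2100.0, 5200.0]
mu, sigma = fit_lognormal(incomes)

print("median on original scale:", lognormal_median(mu))
print("mean on original scale:  ", lognormal_mean(mu, sigma))
```

If predictions are back-transformed with `exp` alone, they will sit systematically below the mean of a skewed income distribution, which might be part of what the handovers are showing.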
Also reverted the old_pidp stuff back to just pidp as it broke all non-synthetic simulation runs. The idea is good and I understand the need but the execution needs to work with both synthetic and non-synthetic runs.
Can see that the data prep is done for cross validation, but none of the simulation targets use this data, and the visualisation script is not set up to visualise anything but default non-MICE runs. CV is even more important here as we're separating the transition and input populations.
Nothing is set up for the SIPHER7 experiment. If we are going to shift all our default runs to a MICE-imputed population then we need an input population that works for both. I'm happy to do this myself when the above problems are sorted.
Adds rudimentary MICE imputation to the data pipeline for MINOS.
Added a full_mice_dataset command that runs MICE on an arc4 node. Takes about 20 mins when the job loads.
Data pipeline split into two, producing two populations: one for transition data and one for input data.
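The description does not say how the two populations are derived. As one possible illustration only, the sketch below splits records by `pidp` so that no individual appears in both the transition and input populations. `split_by_pidp`, its parameters, and the toy records are hypothetical and are not taken from the MINOS pipeline.

```python
import random


def split_by_pidp(records, transition_share=0.5, seed=0):
    """Split records into two populations with disjoint sets of pidp.

    Hypothetical helper: all rows for a given person go to the same side,
    so the transition population never overlaps with the input population.
    """
    pidps = sorted({r["pidp"] for r in records})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(pidps)
    cut = int(len(pidps) * transition_share)
    transition_ids = set(pidps[:cut])
    transition = [r for r in records if r["pidp"] in transition_ids]
    input_pop = [r for r in records if r["pidp"] not in transition_ids]
    return transition, input_pop


# Toy long-format data: 8 people observed in two waves (illustrative only).
records = [{"pidp": p, "wave": w} for p in range(1, 9) for w in (2019, 2020)]
transition, input_pop = split_by_pidp(records)

print(len(transition), "transition rows;", len(input_pop), "input rows")
```

Splitting on the person identifier rather than on rows avoids the same individual informing the transition models and then being simulated, which is the same concern the cross-validation comment raises.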
TODO in the future