It seems the exit code of each subprocess in each job isn't being checked before the job proceeds to the next step. If so, a failure in an early step goes unnoticed, which leads to unclear status reporting and extra manual investigation when things go wrong. It's best to check for errors at each step (including missing output files) and abort as soon as one is detected, with a clear log message stating which error occurred and a non-zero exit code.
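
To make the suggestion concrete, here is a minimal sketch of the pattern, assuming the jobs could be driven from Python. The names `run_step`, `main`, and `StepError`, and the idea of a job as a list of `(command, expected_output)` pairs, are hypothetical and not taken from the existing code; only the standard-library calls are real.

```python
import logging
import subprocess
from pathlib import Path

log = logging.getLogger("job")


class StepError(RuntimeError):
    """Hypothetical error type: a step failed or its expected output is missing."""


def run_step(cmd, expected_output=None):
    """Run one command; raise StepError on any detectable failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as exc:
        # The command could not be started at all (e.g. executable not found).
        raise StepError(f"could not start {cmd!r}: {exc}") from exc
    if result.returncode != 0:
        raise StepError(
            f"{cmd!r} exited with code {result.returncode}: {result.stderr.strip()}"
        )
    # A zero exit code alone is not trusted: also verify the output file exists.
    if expected_output is not None and not Path(expected_output).is_file():
        raise StepError(f"{cmd!r} succeeded but {expected_output} is missing")


def main(steps):
    """Run steps in order; stop at the first failure and return a non-zero status."""
    for cmd, expected_output in steps:
        try:
            run_step(cmd, expected_output)
        except StepError as exc:
            log.error("aborting job: %s", exc)
            return 1
    return 0
```

A job script would pass the return value of `main(...)` to `sys.exit` so that the scheduler sees the non-zero exit code. Raising from `run_step` and deciding the exit status in one place keeps the abort logic and the log message format consistent across steps.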