This could be done in individual table cells, e.g. by using a cell background color or a bold font weight to highlight the best model(s) for each metric. For proper scores, the best models are those with the lowest score in each column. For interval coverage rates, the best models are those whose coverage rate is closest to the nominal level.
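As a rough illustration of the selection rule described above, here is a minimal sketch in plain Python. The function name `best_models`, the `nominal_levels` argument, the tie tolerance, and the metric names in the example are all hypothetical and not taken from the existing code base; the sketch only shows the logic for picking which cells to highlight, not how they would be styled.

```python
import math


def best_models(scores, nominal_levels=None, tol=1e-9):
    """Return {metric: set of model names whose cell should be highlighted}.

    scores: {model_name: {metric_name: value}}
    nominal_levels: {metric_name: nominal coverage level} for coverage metrics
        (hypothetical argument); every other metric is treated as a score
        where lower is better.
    """
    nominal_levels = nominal_levels or {}
    metrics = {metric for row in scores.values() for metric in row}
    best = {}
    for metric in metrics:
        losses = {}
        for model, row in scores.items():
            value = row.get(metric)
            if value is None or math.isnan(value):
                continue  # skip models with no usable value for this metric
            if metric in nominal_levels:
                # Coverage: distance from the nominal level
                losses[model] = abs(value - nominal_levels[metric])
            else:
                # Proper score: lower is better
                losses[model] = value
        if not losses:
            continue
        lowest = min(losses.values())
        # Keep every model within the tolerance so that ties are all highlighted
        best[metric] = {m for m, loss in losses.items() if loss - lowest <= tol}
    return best


# Made-up example data, for illustration only
example_scores = {
    "model_a": {"wis": 10.2, "interval_coverage_95": 0.93},
    "model_b": {"wis": 8.7, "interval_coverage_95": 0.99},
    "model_c": {"wis": 8.7, "interval_coverage_95": 0.90},
}
example_best = best_models(example_scores, {"interval_coverage_95": 0.95})
```

In this made-up example, `model_b` and `model_c` tie on `wis` and would both be highlighted, while `model_a` would be highlighted for the 95% interval coverage because 0.93 is closest to 0.95.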
This issue is closely related to #19 and could potentially be tackled together with it. At minimum, it may be helpful to consider the code organization for the two issues together.