For each of the benchmarks, it would be helpful if the exact configuration were included with the results.
For example, it is unclear whether all the tests were run with VRR enabled; the sway jitter might suggest this was not the case. In addition, for sway, setting a positive output max_render_time will ensure that frame callbacks are emitted before any render tasks, which lowers variability.
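
As a rough illustration of what such a configuration could look like for sway, the fragment below enables VRR and sets a render time budget. The output name `DP-1` and the value `4` are placeholders chosen for the example, not the settings used in the benchmarks. A suitable value depends on the hardware and the refresh rate.

```
# Hypothetical output name; list the real ones with `swaymsg -t get_outputs`.
# Enable VRR (adaptive sync) on this output.
output DP-1 adaptive_sync on

# Composite at most 4 ms before the next display refresh (illustrative value).
# If the value is too low, sway may not finish compositing in time.
output DP-1 max_render_time 4
```

Including a fragment like this alongside each result would make it clear whether VRR and max_render_time were in effect for a given run.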