if self.log_completions and self.state.global_step % self.args.logging_steps == 0:
    prompts_to_log = gather_object(prompts_text)
    completions_to_log = gather_object(completions_text)
    rewards = gather_object(rewards)
    rewards_to_log = rewards.tolist()
    if self.accelerator.is_main_process:
        if is_rich_available():
            print_prompt_completions_sample(
                prompts_to_log,
                completions_to_log,
                rewards_to_log,
                self.state.global_step,
            )
The print_prompt_completions_sample function needs rewards to be passed as a dict (key/value pairs) in order to work as intended. However, the rewards passed in at this step appear to be a list, and they would not be a dict even without the conversion to a list.
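A possible workaround, sketched under the assumption that the function wants a mapping from a reward name to the per-sample reward values, as described above. The helper `rewards_as_dict` and the default key `"reward"` are hypothetical and are not part of the library; the key that the function actually expects would need to be checked against its source.

```python
from collections.abc import Mapping


def rewards_as_dict(rewards, name="reward"):
    """Hypothetical helper: return rewards as a dict of name -> list of floats.

    A flat sequence is wrapped under a single key (`name` is an assumed
    placeholder); an existing mapping is passed through with its values
    converted to plain floats.
    """
    if isinstance(rewards, Mapping):
        return {key: [float(v) for v in values] for key, values in rewards.items()}
    return {name: [float(r) for r in rewards]}


# Usage: wrap the flat list before handing it to the logging call.
rewards_to_log = rewards_as_dict([0.5, 1, 0.0])
```

With this in place, `rewards_to_log` is a dict with one entry rather than a bare list, which is the shape the report says the function requires.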